674                                                                                  IEEE TRANSACTIONS ON BROADCASTING, VOL. 57, NO. 3, SEPTEMBER 2011




An Automatic Recommendation Scheme of TV Program Contents for (IP)TV Personalization

Eunhui Kim, Shinjee Pyo, Eunkyung Park, and Munchurl Kim, Member, IEEE

Abstract: Due to the rapid increase of contents available under the convergence of broadcasting and Internet, efficient access to personally preferred contents has become an important issue. In this paper, an automatic recommendation scheme based on collaborative filtering is presented for intelligent personalization of (IP)TV services. The proposed scheme does not require TV viewers (users) to make explicit ratings on their watched TV program contents. Instead, it implicitly infers the users’ interests in the watched TV program contents. For the recommendation of user-preferred TV program contents, our proposed recommendation scheme first clusters TV users into similar groups based on their preferences on the content genres, derived from the users’ watching history of TV program contents. For the personalized recommendation of TV program contents to an active user, a candidate set of preferred TV program contents is obtained via collaborative filtering for the group to which the active user belongs. The candidate TV programs for recommendation are then ranked by a proposed novel ranking model. Finally, a set of top-ranked TV program contents is recommended to the active user. The experimental results show that the proposed TV program recommendation scheme yields about 77% average precision accuracy and an ANMRR (Average Normalized Modified Retrieval Rank) value of 0.135 with top five recommendations for 1,509 people.

Index Terms: Collaborative filtering, content based filtering, TV personalization, TV program recommendation.

Manuscript received August 31, 2010; revised February 11, 2011; accepted June 27, 2011. Date of publication August 08, 2011; date of current version August 24, 2011. This work was supported by the R&D program of MKE/IITA (A1100-0801-3015, Development of Open-IPTV Technologies for Wired and Wireless Networks).
E. Kim and E. Park are with the Dept. of Electrical Engineering at Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Korea (e-mail: lins77@kaist.ac.kr; epark@kaist.ac.kr).
S. Pyo is with the Dept. of Information and Communications Engineering at KAIST, Daejeon 305-701, Korea (e-mail: sjpyo@kaist.ac.kr).
M. Kim is with the Dept. of Electrical Engineering at KAIST, Daejeon 305-701, Korea (e-mail: mkim@ee.kaist.ac.kr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBC.2011.2161409
0018-9316/$26.00 © 2011 IEEE

I. INTRODUCTION

Due to the convergence of broadcasting and Internet, the number of TV program contents available at the user side is rapidly increasing, and accessibility to the TV program contents has become an important issue in the TV watching environments of traditional TV, IPTV, and TV portal services. Therefore, it is important for users (TV viewers) to easily find and access their preferred ones among the TV program contents available to their terminals. There are two approaches to providing users with their preferred TV program contents: content searching and recommendation. For content searching, TV users usually input query words into search engines via a graphical user interface (GUI) and get search results from which they finally select their preferred ones. The disadvantage of content searching is that users often do not know the keywords needed to find what they want. Content recommendation, in contrast, makes it possible to recommend to the users their preferred TV program contents. A content recommendation based approach can greatly alleviate the user’s burden in accessing preferred TV program contents. That is, the many interactions with the TV program GUI that are common in traditional (IP)TV environments can be reduced. We use the term “(IP)TV” to mean “IPTV and conventional TV” in this paper.

Collaborative filtering (CF) has often been used to recommend goods for e-commerce on the Internet. The main idea of CF techniques is based on item preferences of similar users [1]–[3]. However, CF has the following characteristics: (1) it is often designed to recommend to the active users (the target users for recommendation) the items which have not been purchased (consumed) before. This is not always appropriate for TV viewers because they are likely to watch the same TV program series, such as dramas, series, and news, which are repeatedly broadcast; (2) dealing with many users and items incurs high computational complexity.

Content based filtering (CBF) has been used in information filtering [1]–[3]. CBF usually recommends items based on the descriptions of items previously evaluated by the active users. The weak points of CBF are as follows: (1) its recommendation is restricted to items similar to those that the active user has rated or consumed. In other words, it is an over-specialized recommendation which heavily depends on the user’s consumption history on items; (2) it also requires heavy computation for reliable recommendation, in analyzing various characteristics of the items that users have consumed [6].

In this paper, we present a personalized automatic recommendation scheme of TV program contents based on CF. The proposed recommendation scheme consists of three parts: user profile reasoning, user clustering, and recommendation of TV program contents. Unlike the traditional CF-based item recommendation that requires explicit ratings on the purchased items by users, the proposed recommendation scheme implicitly learns the user’s interest in the TV program contents and genres. It does not require users’ explicit ratings on their watched TV program contents, which makes it more practical in real TV watching environments. For user clustering, similar user groups are clustered based on the feature vectors of TV program genres computed from the usage history of the TV program contents watched by users. Two methods of user
KIM et al.: AUTOMATIC RECOMMENDATION SCHEME OF TV PROGRAM CONTENTS FOR (IP)TV PERSONALIZATION                                        675



clustering are compared in this paper: demographic clustering and K-means clustering. Finally, CF-based recommendation is performed with a novel ranking model that extends the Best Match (BM) model [7]–[9] to rank the candidate TV program contents for recommendation. The proposed rank model is designed to give easy access to preferred TV program contents. For the recommendation of popular or newly broadcast TV program contents, the popular TV program contents can be identified via CF from similar user groups. On the other hand, newly broadcast TV program contents are preferentially recommended by excluding TV program contents that were not recently broadcast, that is, those outside a sliding time window in the history data of watched TV program contents.

This paper is organized as follows: Section II reviews the previous related works on recommendation; Section III introduces the overall system architecture of our proposed automatic recommendation scheme for TV program contents and describes the data used for experiments; Section IV describes the components of the proposed recommendation scheme in detail: user profile reasoning, similar user clustering, and ranking of candidate TV program contents for recommendation; in Section V, the proposed rank model is explained in detail; the experimental results are presented in Section VI; finally, Section VII concludes this work.

II. RELATED WORKS

PTV [4] adopted a hybrid method of CF and CBF to compensate for the item ramp-up problem of CF and the user ramp-up problem of CBF [1]. It requires users to provide their preference information on contents while enrolling. Based on this preference information provided by the users, it creates and manages user profiles with explicit ratings by users. However, users in general do not want to offer their personal information, or sometimes do not faithfully express their interests in the items with explicit ratings. Pazzani et al. reported that only 15% of people respond to requests for relevance feedback on their preferences [5]. Therefore, requiring users to rate items explicitly is one of the main causes of the rating sparsity problem [3].

Deshpande M. et al. proposed an item based top-N recommendation algorithm [11], and J. Wang et al. [12] proposed an extension to the “relevance model” from language modeling [13]; both utilize CF with user-profile and item matching for item recommendation. The item based top-N recommendation algorithm uses an item-to-item matrix for item recommendation, which is computed from a user-to-item matrix [11]. The recommendation performance is further improved by extending a language model to a relevance model [12].

In item recommendation for e-commerce, recommender systems tend to suggest new items to the users because they are not likely to repurchase the same or similar items after they have bought them. However, this may not be appropriate in TV environments, where TV viewers are expected to watch (consume) the TV program contents (items) that they have been accustomed to watching. In general, TV viewers tend to watch popular TV program contents as users with similar tastes do, or specific TV program contents according to their individual preferences. So, the previous two models are insufficient in that the TV program contents more frequently watched by an active user are recommended in lower ranks. The proposed rank model in this paper tries to remedy this weak point of the previous rank models.

In CF systems with a large number of users, the process of grouping similar users and recommending items raises a computational complexity issue [1], [3]. To relieve the system overload in clustering users with similar tastes among a large number of users, G.R. Xue suggests a two-step clustering method using offline K-means clustering and online PCC (Pearson correlation coefficient) clustering for the item ratings [16]. In this method, the K-means clustering is performed offline for a large number of users just one time for a given K value. Then more similar users are extracted online for active users, based on PCC values within their respective clusters, to predict the rating values for the unrated items. However, this method needs to know an appropriate cluster number a priori.

For user clustering in this paper, we compare two clustering methods: demographic clustering and K-means clustering. The demographic clustering is very simple in that it uses only the users’ demographic information, such as gender and age, for clustering. For the K-means clustering, we use the feature vectors of user preference values on 8 genres and 47 subgenres of TV program contents. The former is computationally very simple and can be a solution to the cold-start problem, in which it takes time to learn about users, but it requires a priori knowledge of the demographic information. On the other hand, the latter does not require such demographic information but implicitly clusters similar users based on the genre preferences derived from the users’ history data of watched TV program contents. For the K-means clustering, an appropriate number for K can be found by searching a range based on the dendrogram of hierarchical clustering [15], [17], [18]. We then determine a K value based on the smallest sum of squared errors in this paper. The details of finding the right K value are described in Section IV. As a rank model for ordered recommendation of TV program contents, we propose a novel rank model based on the BM model [7]–[9]. The proposed rank model is described in Section V in detail.

The contributions of our personalized automatic recommendation scheme for TV program contents can be summarized as follows: (1) it is more appropriate for TV program recommendation since it implicitly infers user preferences on TV program contents from the watched TV program history data, which does not require users to explicitly rate their watched TV program contents; (2) it takes into account not only the group preferences but also the individual user’s preferences on TV program contents for recommendation; and (3) the proposed rank model elaborates collaborative filtering by considering the relative lengths of watching times for TV program contents, not just by counting the number of users who have watched them.

III. ARCHITECTURE OF THE PROPOSED RECOMMENDATION SYSTEM AND EXPERIMENT DATA

A. Proposed Recommendation System Scheme

In this paper, it is assumed that TV terminals are connected to the content servers of TV programs via back channels so that
the usage (or watching) history of (IP)TV program contents can be collected at the server side. In IPTV environments, TV program contents are streamed over IP networks, and the responsible content providers at the head-end side can collect the usage history of TV programs watched by the users via back channels. Fig. 1 shows the architecture of our proposed automatic recommendation system for TV program contents. The automatic recommendation scheme consists of three agents: (1) the user profile reasoning agent computes user preferences on genres and TV programs by analyzing the user’s watching history of TV program contents, for which it collects TV usage history from local repositories of TV terminals; (2) the user clustering agent clusters the users (TV viewers) into user groups with similar preferences; (3) the recommendation agent recommends to each active user a list of his/her preferred TV program contents. Here, an active user means a user who logs into the TV terminal and is ready to receive a recommended TV program list.

Fig. 1. Architecture of the proposed recommendation system for TV program content.

For recommendation, a list of candidate TV program contents is extracted based on CF, and our proposed rank model calculates the respective scores of the candidate TV program contents for ranked recommendation. Then the TV program contents with the highest scores are presented to the active user in descending rank order as the result of recommendation. Note that in this paper the terms users and items are used interchangeably with TV viewers and TV program contents, respectively.

B. Description of Usage History Data Set for Watched TV Program Contents

For the experiments to test the effectiveness of the proposed recommendation scheme, Nielsen Korea’s TV usage history data set of 2,005 people is used, which was collected on 6 terrestrial TV channels for 6 months from Dec. 1, 2002 to May 31, 2003. Table I shows the data fields of the usage history data set for watched TV program contents. The TV program contents in the history data set have 8 main genres, which are further divided into 47 subgenres in total. For the data set, the total number of TV program titles amounts to 924 and the total number of subtitles is 1,855. We use 795 TV program contents for training, corresponding to the first 4 months, and 629 TV program contents for testing, corresponding to the last 2 months. Notice that the sum (1,424) of the TV program contents for training and testing exceeds the total number (924) of the titles because 500 watched TV program contents are titles that were repeatedly broadcast. Table I shows the information attached to the broadcast TV program contents.

TABLE I
FIELDS OF TV USAGE HISTORY DATABASE

IV. PROPOSED RECOMMENDATION SCHEME

A. User Profile Reasoning

In this paper, a user is characterized in terms of his/her profile, which consists of two kinds of preferences: on items (TV program contents) and on genres. First of all, we remove from the usage history data set all the TV program contents that have not been watched for more than 10% of their respective total lengths. The preference on a TV program content is defined as the ratio of the watched time to the total time length. The reruns of a TV program content are all considered the same title (item). The preference on an item by a user is defined as

(1)

in terms of the number of times the item has been broadcast and a second quantity, which is defined as

(2)

from the watched time length for the item by the user and the total length of the item. It must be pointed out that the item preference in (1) might be inaccurately computed for inattentively watched TV program contents. Their treatment is out of the scope of this paper.
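
The item preference computation described above can be illustrated with a minimal sketch. The bodies of (1) and (2) are not reproduced in this text, so the aggregation used here (summing the per-broadcast watched ratios of a title and dividing by its broadcast count) is an assumption made for illustration, and the names `item_preferences`, `viewings`, `broadcast_counts`, and `MIN_WATCH_RATIO` are hypothetical.

```python
from collections import defaultdict

# Viewings at or below 10% of the program length are discarded (Section IV-A).
MIN_WATCH_RATIO = 0.10


def item_preferences(viewings, broadcast_counts):
    """Return {(user, item): preference} from raw viewing records.

    viewings: iterable of (user, item, watched_seconds, total_seconds),
              one tuple per watched broadcast; reruns share the same item title.
    broadcast_counts: {item: number of times the item was broadcast}.
    """
    ratio_sums = defaultdict(float)
    for user, item, watched, total in viewings:
        if total <= 0:
            continue  # skip malformed records
        ratio = min(watched / total, 1.0)  # watched ratio for one broadcast
        if ratio <= MIN_WATCH_RATIO:
            continue  # not watched for more than 10% of its length
        ratio_sums[(user, item)] += ratio
    # Assumption: preference = summed watched ratio / number of broadcasts.
    return {
        (user, item): ratio_sum / broadcast_counts[item]
        for (user, item), ratio_sum in ratio_sums.items()
    }


history = [
    ("u1", "news", 1800, 3600),   # half of the first broadcast
    ("u1", "news", 3600, 3600),   # all of the rerun
    ("u1", "drama", 200, 3600),   # under 10%, filtered out
]
prefs = item_preferences(history, {"news": 2, "drama": 1})
print(prefs)  # {('u1', 'news'): 0.75}
```

Under this assumption, a title that is broadcast often but watched only occasionally receives a lower preference than one watched at every broadcast, which matches the intent of treating reruns as a single item.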



Since the popularity or recency of TV program contents often diminishes with time and the user’s interest in TV program contents varies over time, it is more appropriate to reflect the recently watched TV program contents in recommendation. Therefore, a time window function is defined as

(3)

whose control parameter for the window size is set to a two-month or four-month length in this paper.

The average item preference of a user is given by

(4)

which involves the total number of items in the watched TV program list of the user.

In order to efficiently perform similar user clustering in low dimension, genre preference is used, which can reflect the similarity of users’ content consumption for TV program contents. Genre preference is computed by accumulating the item preference values for the genre and is then normalized by the total genre preference values for all genres. Given the total number of genres, the genre preference on a genre by a user is defined as

(5)

B. User Clustering

For computational efficiency and effectiveness of collaborative filtering, TV users are clustered into similar user groups. After clustering, each user has a membership in one of the user groups. Therefore, the CF operation for an active user is performed for the user group to which the user belongs, not for all users.

For similar user grouping, two clustering approaches are compared: demographic clustering based on gender and age, and K-means clustering based on the genre preference as described in (5). For K-means clustering, two feature vectors are compared, with 8 preferences on the main genres and 47 preferences on the subgenres, respectively.

The demographic clustering is computationally very simple but can only be used if demographic information such as gender and age is available. The demographic clustering can avoid the cold-start clustering problem, in which it usually takes time to learn about the users. In our demographic clustering, there are 26 combinations of different genders and ages. The genders are divided into two classes, male and female, and the ages are divided into 13 classes, the last of which is 66 and older.

As an alternative, K-means clustering can be used, which does not require the demographic information. The essential prerequisite for K-means clustering is to know an appropriate number K as the number of clusters. In order to find the right K and reveal the characteristics of cluster distributions, we take a two-step approach: an unsupervised hierarchical clustering is first run to construct a dendrogram, from which a range of K values is found by cutting its branches at the large jumps in a distance criterion [14], [15], [17], [18]; the final K value is then determined in the range by repeatedly performing K-means clustering. To determine the final K value, K-means clustering is repeated a fixed number of times, with the centroids of the K clusters randomly initialized each time. When the clustering yields the same clustering results a required number of times for a given K value, those clustering results become the final clusters with that K value. When no K value yields the same clustering results the required number of times, the K value that yields the same clustering results most often is selected as the final K value. In this paper, the repetition count and the required number of identical results are set to 1,000 and 5, respectively. Table II shows the K ranges and the finally selected K values for the feature vectors of 47 subgenres and 8 main genres, for the TV program contents watched by the users who have watched at least 33% of the average number of watched TV program contents during the training period. For this experiment, the open-source code Cluster 3.0 [14] was used.

TABLE II
SELECTION RESULTS OF CLUSTER NUMBERS, K

The K-means clustering, which is the most time-consuming task in our scheme, takes less than one minute in our experiments on a PC with an Intel Core 2 Quad 2.4 GHz CPU and 2 GB of memory.

C. Recommendation of TV Program Contents

In order to recommend the preferred TV program contents to an active user, the recommendation process consists of three steps: extracting similar preference users from the cluster (similar user group) to which the active user belongs; selecting candidate items for recommendation; and ranking the candidate items. The rank model in particular is explained in detail in Section V.

1) Selecting Similar Preference Users of an Active User in a Group: In Section IV-B, clustering of the similar preference users is done offline. For recommendation, more relevant users are further extracted to construct a set of candidate TV program contents based on CF for the user group to which an active user belongs. By doing so, the computational complexity is lowered by reducing the number of users considered from all users to the similar peer users in the user group to which the active user belongs.

Based on the proximity measure, the peer users with the most similar preferences are extracted for the active user. For the proximity measure, the normalized correlation is computed by subtracting the average preference value from all the preference values [10].

On the basis of the consumed (watched) item (TV program contents) list of an active user, the similarity between the active user and each peer user is measured as the proximity between



the normalized preference values on items for the two users in the similar preference group. The similarity is defined as

(6)

which is computed over the items belonging to the watched item list of the active user, a list that represents the active user’s profile. It uses the preference value on each item consumed by a user as in (1) and the averaged item preference value of that user as in (4).

Only the users that satisfy the similarity criterion are regarded as relevant users to the active user. Then CF is performed for the item lists between the active user and each of the relevant users. Since the number of similar preference users affects precision performance, we need to find an optimal number of peer users based on the average precision accuracy, which is explained in Section VI.

2) Filtering Candidate Items With EPG Information: After selecting the relevant users for an active user, their preferred items become the candidate items for recommendation. But some items may not be available on TV channels because broadcasting of those TV program contents has ended. In the case of linear TV broadcasting services, Electronic Program Guide (EPG) information can be used to filter out the candidate TV program contents which are not available.

3) Ranking Items: After a set of candidate TV program contents for recommendation is determined, they are ordered by a rank model. Finally, the recommended TV program contents are presented to active users in descending order of rank scores. The proposed rank model is described in the following section.

two conditions: one is that the weight is independent of term frequency; and the other is that the weight is linear with term frequency. The settings under which each condition is satisfied are given in [9], but the second condition is not always satisfied. To remedy this, a scaling factor is added in the numerator. This is taken into account in the BM11, BM15, and BM25 models [7].

The BM25 includes a condition that is inappropriate for TV program recommendation, since it gives a high weight to short documents compared to long documents under the scope hypothesis [7]. Therefore, the proposed rank model in this paper extends the BM15 model, which does not incorporate the scope hypothesis into its rank model, so that the TV program contents that were broadcast fewer times are prevented from being ranked higher.

B. Proposed Rank Model

An extension to the BM15 is made by taking into account the collaborative filtering concept, which accounts for the watching times of users in the rank model for recommendation of TV program contents. Furthermore, we add to the rank model a weight based on the correlation between the candidate items for recommendation and the items watched by the active user. We score the filtered candidates of TV program contents in a ranked order. The relations between a candidate document and a query in BM15 are translated into the relations between candidate TV program contents for recommendation and the active user.

To make the BM15 applicable to the recommendation of TV program contents, we make the following assumptions: (1) the watched TV program list represents the active user; (2) the term frequency is transformed into the relative watching frequencies of both TV program contents of similar preference users and
                                                                            of an active user by applying CF concept, where
                                                                            indicates the watched TV program contents by ; (3)
                 V. PROPOSED RANK MODEL
                                                                          is regarded as the relative watching frequency of by ;
A. Related Work—BM Model                                             (4) the similarity between and by             is further taken into
                                                                     account. The matching score between and               is defined as
  Our proposed ranked model extends the Best Match (BM)              our proposed rank model by
model [7]–[9]. The BM is a ranking function used by retrieval
engines to rank matching documents according to their rele-
vance to a given query. The BM model is given by

                                                              (7)
                                                                                                                                    (9)

where                                                                where and are used to balance the term frequency and
                                                                     the query term frequency        in the rank model. The Robertson
                                                              (8)    et al. analyzed the way of weighting in details [7].         and
                                                                         are set to 200 and 0.2 empirically in this paper. In (9),
                                                                                          indicates the relative watching frequency
In (8), is the number of total documents, is the number of
                                                                     which is the ratio of the total number of watching times of both
documents including a specific term of query, is the number of
                                                                     programs and over the total number of watching times of
documents related with a specific topic, and is the number of
                                                                     the TV program contents (all ’s) by the               peer users.
documents including a specific term of query and is related with
                                                                     Therefore, the relative watching frequency is calculated as
the specific topic [8]. In (7), is term frequency in documents
and      is term frequency in the query.
   The BM model originates from two Poisson models that
the term frequency is independent of relevant and irrelevant                                                                       (10)
documents [9]. Based on this idea, the simple formation
                              is suggested under the following
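The frequency terms that enter (9) can be illustrated with a short sketch. Because the notation of (9) and (10) is not legible here, the Python fragment below is only one possible reading of the verbal definitions: the relative watching frequency is taken as the share of all peer watching times that fall on the candidate program or on the program watched by the active user, and the saturation function is the standard Okapi form (k + 1) f / (k + f). The data layout, the function names `relative_watching_frequency`, `active_user_frequency` and `saturate`, the sample counts, and the assignment of the values 200 and 0.2 to the two balancing parameters are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical layout: user id -> {program id: number of watching times}.
WatchCounts = Dict[str, Dict[str, int]]


def relative_watching_frequency(peers: WatchCounts, candidate: str, watched: str) -> float:
    """Share of all peer watching times that fall on `candidate` or `watched`.

    This is one reading of the verbal definition of the relative watching
    frequency; the exact formula (10) is not available here.
    """
    total = sum(sum(counts.values()) for counts in peers.values())
    if total == 0:
        return 0.0
    programs = {candidate, watched}  # a set, so candidate == watched is not counted twice
    joint = sum(counts.get(p, 0) for counts in peers.values() for p in programs)
    return joint / total


def active_user_frequency(active: Dict[str, int], watched: str) -> float:
    """Share of the active user's watching times that fall on `watched`."""
    total = sum(active.values())
    return active.get(watched, 0) / total if total else 0.0


def saturate(freq: float, k: float) -> float:
    """Okapi-style saturation (k + 1) * f / (k + f); returns 0 for f <= 0."""
    if freq <= 0:
        return 0.0
    return (k + 1.0) * freq / (k + freq)


# Made-up watching counts for two peer users and one active user.
peers = {"u1": {"news": 3, "drama": 1}, "u2": {"news": 1, "quiz": 5}}
active = {"news": 4, "quiz": 4}

f_peer = relative_watching_frequency(peers, "drama", "news")  # (1 + 3 + 1) / 10
f_active = active_user_frequency(active, "news")              # 4 / 8

# Which of the two reported values (200, 0.2) belongs to which term is an assumption.
term = saturate(f_peer, 200.0) * saturate(f_active, 0.2)
```

In this sketch a candidate scores higher when the peers watch it (together with a program from the active user's list) often and when the active user also watches that listed program often. The weights of (12) and (13) are deliberately left out.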



Fig. 2. Illustration for significance on weights w .

In (9),    indicates the ratio of the total number of watching times of TV program contents over the total number of watching times for all the TV program contents (all ’s) by    , and is given by

                                                                      (11)

Eq. (9) can be explained intuitively as follows:    is regarded as the peer users’ preferences on TV program contents in the same user group to which    belongs; and    is referred to as the active user’s preference on TV program contents. The two terms    and    are in mutually supplemental relation as being multiplied together.

In (9),    . Two weights    and    are given as

                                                                      (12)

                                                                      (13)

In (12),    indicates the total number of broadcast times for all    items and is the number of broadcast times of each item.    reflects the inverse document frequency with independence assumption between the documents with and without the terms [8]. In this paper, it is assumed that the document for retrieval is    and the specific term of query is from active user profile    .    is added as a weight for the similarity between    and which is calculated as vector cosine correlation similarity    in (13) for which the    and    are the feature vectors of user preference on program    and , respectively. This weight puts more emphasis on the active user’s personal preference on TV program contents, which is not reflected in the original BM [7], [8].

In order to see the effectiveness of    in (13), Fig. 2 illustrates an example of    similarity measures between two items. There are two users,    and , who have watched two items (TV program contents) and , and the similarity (VCC) value between and is 0.643. On the contrary, for the two users ( and ) who have watched both and , the VCC value between and is 0.4. So, if we set 0.5 of the VCC value as a threshold for the similarity between items, then the items and are considered “similar”, but and are not similar. So,    in (9) can improve the rank model by taking into account the relation between the active user and the candidate items for the score calculation. The effect of    on precision performance will be shown in Fig. 5 in Section VI.

VI. EXPERIMENTAL RESULTS

For the usage history of watched TV program contents explained in Section III-B, we use the usage history data of four months for training and the remaining two months for testing.

In this experiment section, we measure the performance of our recommendation scheme in terms of both precision/recall and Average Normalized Modified Retrieval Rank (ANMRR) which considers the rank orders in retrieval [19]–[21].

A. Performance Measure of Rank Models

Precision and Recall: The performance in information retrieval is usually measured in terms of precision and recall [22]. The precision is defined as the ratio of how many watched TV program contents (relevant documents) are contained in the recommendation list (retrieved documents) of TV program contents for an active user. The recall is defined as the ratio of how many recommended TV program contents (retrieved documents) are actually included in the watched TV programs (relevant documents) for the active user. The precision and recall are defined as

                                                                      (14)

                                                                      (15)

where    is the number of watched TV program contents in the recommended list of TV program contents and    is the number of recommended TV program contents.    is the number of recommended TV program contents in the watched list of TV program contents, and    is the number of watched TV program contents. For the recommendation of TV program contents, the precision is a more appropriate metric for performance evaluation than the recall since the recall accuracy is increased as the number of recommended TV program contents increases. In this regard, recommending a larger number of TV program contents increases false positives. So, in this paper, we use precision accuracy for performance evaluation. However, if the number of ground truth increases, the precision also becomes higher. So, the performance of the rank models is measured in terms of precision and recall.

ANMRR: Compared to precision measure, another performance measure, ANMRR [19]–[21], is considered which has been developed to measure the image retrieval performance in




Fig. 3. Preferences on genres and channels for groups: Demographic Clustering (DM) vs. K-means clustering (KM). (a) Genre preferences of groups. (b) Channel preferences of groups.



MPEG-7 [20]. The ANMRR indicates not only how many correct items are recommended but also how highly the more relevant items are ranked among the recommended items. For ANMRR, Normalized Modified Retrieval Rank (NMRR) is defined as

                                                                      (16)

where    is the number of recommended TV program contents that the active user has really watched longer than the average watching times of his/her preferred TV program contents during the test period.    is the allowable maximum rank and is computed as    where    is the maximum of    [21]. And the    in (16) is revised by

                                                                      (17)

where    is the rank ordered in score values by the proposed rank model in this paper. Finally ANMRR is written as follows:

                                                                      (18)

B. Clustered Data Analysis

As explained before, two clustering methods are compared: demographic clustering and K-means clustering. A cluster ’s preference on a specific genre is computed by accumulating the preferences on the specific genre by all users in the same cluster, and then it is normalized by the total number    of users in the cluster . Similarly, the normalized preference on a channel can also be computed for each cluster. The preferences    and    on genre and channel for a cluster    are calculated as

                                                                      (19)

                                                                      (20)

where    is the total number of users in the cluster .    and    are the total numbers of genres and channels, respectively.

Fig. 3 shows the profiles of clusters’ preferences on genres and channels of TV programs. As shown in Fig. 3, the genre preferences are not significantly distinguished among different groups by demographic clustering (DM). On the other hand, the groups by K-means clustering (KM) show somewhat different patterns for genre preferences among them. This is also similarly observed for the channel preferences except the group4 and group5 by DM.

Table III shows the average precision performance for different numbers of groups by DM and KM. Although the preferences on genres and channels are better distinguished by KM than DM for different groups, the performance difference of average precision between DM and KM is very slight. In this experiment, the KM turns out to be effective for recommendation



TABLE III
COMPARISONS OF AVERAGE PRECISION BETWEEN DM AND KM
outlier criteria (33%); refer to Table V.
the number of peer users is 5; refer to Fig. 8.

TABLE IV
AVERAGE PRECISION PERFORMANCE FOR THE NUMBER OF CLUSTERS
outlier criteria (33%); refer to Table V.
the number of peer users is 5; refer to Fig. 8.

TABLE V
PRECISION ACCURACY WITH OUTLIER REMOVALS
the number of clusters is 26 by DM; refer to Table IV.
the number of peer users is 5; refer to Fig. 8.




although it does not utilize the demographic information for clustering.

Table IV shows the average precision performance on Top-   recommendations for different numbers of clusters (groups) by DM. Increasing the number of clusters does not enhance the precision performance because we only use the most similar    users to an active user of his/her group (the average precision performance according to the number of peer users will be shown in Fig. 8).

C. Exclusion of Noisy Items and Outliers of Users

For the experiments, two kinds of outliers are removed to have reliable recommendation: firstly, the TV program contents that were watched less than 10% of their respective whole lengths are removed as noise; secondly, the users who have watched TV program contents fewer than a predefined number of TV watching times are excluded.

Fig. 4 shows the number of users versus the relative watching lengths for the two TV program contents “Hometown at 6 o’clock” and “Let’s marry”. For both TV program contents, there are relatively large numbers of users who have watched them less than 10% or more than 95% of the total TV program lengths, respectively. This pattern is similarly observed in other TV program contents. So, we set 10% of the total length of TV program contents as a threshold for outlier removal.

Fig. 4. Number of users versus relative watching lengths of TV program contents.

Table V shows the average precision performance on different thresholds of outlier removal for the second case. With the exclusion of users who watched fewer than 33% of the average number of watched TV program contents, we obtain 76.6% of average precision accuracy for the Top-5 recommendation. The threshold values in Table V indicate the ratios of the number of watched TV program contents by each user over the average number of watched TV program contents by all users during the training period of 4 months. The average numbers of watched TV program contents by all users are 124 and 90 during the training period of 4 months and the testing period of 2 months, respectively.

D. Effect of    on Precision Performance of the Proposed Rank Model

Fig. 5 shows the performance comparison in terms of average precision accuracy for the recommended TV program contents with and without    in (9).

Fig. 5. Performance comparison with/without w .

The average precision accuracies with    in the proposed rank model are higher than those without it. The average precision in this experiment is measured with 67 active users.

E. Performance Comparison Between Proposed Rank Model and Linear Model

For performance comparison in precision and ANMRR between the proposed rank model and the linear model [12], 90




Fig. 6. Performance comparison in precision and recall.




                                                                     Fig. 8. Precision accuracies versus different numbers of similar peer users.
                                                                     (a) Precision accuracies with Top-5 recommendations. (b) Precision accuracies
                                                                     with four clusters.

Fig. 7. Performance comparison in ANMRR.
users are randomly selected. Figs. 6 and 7 show the performance comparisons of the proposed model and linear model in precision-recall and ANMRR, respectively. The proposed rank model outperforms the linear model in both precision-recall and ANMRR.

Notice that the smaller the ANMRR value is, the better the recommendation performance is. Ideally, the case of    is achieved when the ranked order of the recommended TV program contents is perfectly matched with the order of the watched TV program contents by the active user during the test period. Therefore the recommended TV program contents by the proposed rank model are also better matched in ranked orders than the linear model.

The superiority of our proposed rank model comes from the facts that:
  1) The proposed rank model defines the weight    in (12) such that more frequently broadcast TV program contents are put in lower ranks;
  2) For recommendation of TV program contents, the traditional models usually intensify the preference of the peer users but relatively reduce the preference of an active user, which might be appropriate to recommend unpurchased items to active users in e-commerce environments. However, in TV environments, users often tend to watch the TV program contents that they used to watch. Therefore, it is reasonable to take into consideration the preferences of both similar users and an active user for recommendation. The proposed rank model actually considers both;
  3) It takes into account the number of watching times for both TV program contents and by peer users in the score calculation for ranking. This elaborates collaborative filtering further. On the other hand, the linear rank model [12] simply counts the number of peer users who have watched both    and .

F. Performance Analysis for Proposed Recommendation Scheme

We investigate the performance of the proposed recommendation in terms of the number of clusters, the number of similar peer users and the number of TV program contents for final recommendation. Fig. 8 shows precision performance for different numbers of similar peer users given Top-5 recommendations and 4 clusters.

In Fig. 8(a), the average precision performance slightly decreases as the number of similar peer users increases for different numbers of clusters. This is because a smaller number of similar peer users yields more correlation between the active user and peer users so that the resulting recommendation precision is usually enhanced. When the number of clusters increases, the resulting recommendation precision seldom varies for different numbers of recommended items, because the larger the number of clusters, the more correlated the clustered users are. In Fig. 8(b), the average precision performance of the proposed recommendation scheme decreases as the number of recommended TV program contents increases.

Table VI shows the 19 recommended TV program contents by the proposed rank model for the corresponding ground truth items out of 67 for an active user with ID = 213039903. As



aforementioned, the more frequently watched TV program contents such as daily news, daily soap opera and weekly regular drama are shown to appear higher-ranked. So this can help active users to easily access their frequently watched TV program contents. On the other hand, the low-ranked items by the proposed rank model are the TV program contents that were not often or never watched by the active user but frequently watched by his/her peer users via the incorporation of collaborative filtering into recommendation.

TABLE VI
RECOMMENDATION RESULTS AND GROUND TRUTH FOR AN ACTIVE USER WITH ID = 213039903
# recommendation order, ## preference order of the active user.

[2] R. Burke, “Hybrid recommender systems: Survey and experiments,” User Modeling and User-Adapted Interaction, vol. 12, no. 4, pp. 331–370, Nov. 2002.
[3] G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions,” IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749, Jun. 2005.
[4] P. Cotter and B. Smyth, “PTV: Intelligent personalized TV guides,” Amer. Assoc. AI, pp. 957–964, 2000.
[5] M. Pazzani and D. Billsus, “Learning and revising user profiles: The identification of interesting web sites,” Machine Learning, vol. 27, pp. 313–331, 1997.
[6] N. Good, J. B. Schafer, J. A. Konstan, A. Borchers, B. Sarwar, J. Herlocker, and J. Riedl, “GroupLens research project, combining collaborative filtering with personal agents for better recommendations,” Amer. Assoc. AI, 1999.
[7] S. E. Robertson, S. Walker, M. Beaulieu, M. Gatford, and A. Payne, “Okapi at TREC-4,” in 4th Text Retrieval Conf. (TREC-4), 1995, pp. 73–96.
[8] S. E. Robertson and K. Sparck Jones, “Relevance weighting of search terms,” J. Amer. Soc. Inf. Sci., vol. 27, pp. 129–146, 1976.
[9] S. E. Robertson and S. Walker, Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. New York: Springer-Verlag, 1994, pp. 232–241.
[10] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “GroupLens: An open architecture for collaborative filtering of netnews,” in ACM Conf. Comput. Supported Cooperative Work, 1994, pp. 175–186.
[11] M. Deshpande and G. Karypis, “Item-based top-N recommendation algorithms,” ACM Trans. Inf. Syst., vol. 22, no. 1, pp. 143–177, Jan. 2004.
[12] J. Wang, J. Pouwelse, J. Fokker, A. de Vries, and M. Reinders, “Personalization on a peer-to-peer television system,” Multimedia Tools Appl., vol. 36, no. 1/2, pp. 89–103, 2007.
[13] J. Lafferty and C. Zhai, “Probabilistic relevance models based on document and query generation,” Language Modeling Inf. Retrieval, 2002.
[14] M. J. L. De Hoon, S. Imoto, J. Nolan, and S. Miyano, “Open source clustering software,” Bioinformatics, p. 781, 2004.
[15] D. P. Vetrov and L. I. Kuncheva, “Evaluation of stability of K-means cluster ensembles with respect to random initialization,” IEEE Trans. PAMI, vol. 28, no. 11, pp. 1798–1808, 2006.
[16] G. Xue, C. Lin, Q. Yang, W. Xi, H. Zeng, Y. Yu, and Z. Chen, “Scalable collaborative filtering using cluster-based smoothing,” in ACM SIGIR, Aug. 2005, pp. 114–121.
[17] S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd ed. New York: Academic Press, 2006, pp. 572–582.
[18] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley-Interscience, 2001, pp. 542–559.
[19] B. S. Manjunath, J.-R. Ohm, V. V. Vasudevan, and A. Yamada, “Color and texture descriptors,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 6, pp. 703–715, Jun. 2001.
[20] P. Ndjiki-Nya, J. Restat, T. Meiers, J.-R. Ohm, A. Seyferth, and R. Sniehotta, “Subjective evaluation of the MPEG-7 retrieval accuracy measure (ANMRR),” in ISO/WG11 MPEG Meeting, Geneva, Switzerland, May 2000, Doc. M6029.
[21] W. Ka-Man and P. Lai-Man, “MPEG-7 dominant color descriptor based relevance feedback using merged palette histogram,” in IEEE Int. Conf. Acoust., Speech, Signal Process., May 2004, vol. 3, pp. 433–436.

VII. CONCLUSION

In this paper, we propose an automatic recommendation scheme of (IP)TV program contents for TV personalization. Unlike the traditional recommendation in document retrieval or
                                                                                 [22] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Informa-
e-commerce, the proposed scheme does not require the explicit                         tion Retrieval. Cambridge, U.K.: Cambridge Univ. Press, 2008, pp.
ratings on watched TV program contents, rather making im-                             151–175.
plicit reasoning for user preference on TV program contents in
the usage history data of watched TV program contents. The
rank model in the proposed scheme takes into account not only
the group preferences but also the active user’s preferences on                                        EunHui Kim received the B.E. degree in infor-
TV program contents. Furthermore, the proposed rank model                                              mation and communications engineering from
elaborates collaborative filtering by considering the relative                                          Chungnam National University in 2000 and the
                                                                                                       M.Sc. degree in information communications engi-
lengths of watching times for TV program contents, not just by                                         neering from Korea Advanced Institute and Science
simply counting the number of users who have watched them.                                             Technology (KAIST), Daejeon, Korea in 2009. She
Our proposed recommendation scheme shows the effectiveness                                             is currently pursuing the Ph.D. degree in Department
                                                                                                       of Electrical Engineering at KAIST.
with rich experimental results for a real usage history dataset                                           She worked for Samsung Electronics as an As-
of watched TV program contents.                                                                        sistant Engineer of Software team in Visual Display
                                                                                                       Division during 2000-2003 in Suwon, Korea and as
                              REFERENCES                                      an Associate Engineer of Architecture team in Digital Solution Center during
                                                                              2003–2007 in Seoul Korea. Her research interests include personalization in
   [1] M. Montaner, B. Lopez, and J. L. DE Larosa, “A taxonomy of recom-      connected TV, data clustering, collaborative filtering, and recommendation
       mender agents on the Internet,” AI Rev., vol. 19, pp. 285–330, 2003.   modeling with AI for smart TV interaction.
Shinjee Pyo received the B.E. degree and the M.Sc. degree in information
and communications engineering from KAIST, Daejeon, Korea, in 2007 and
2009, respectively. She is currently pursuing the Ph.D. degree in
information and communications engineering at KAIST. Her research interests
include personalization in connected TV, sequential pattern mining for TV
personalization, and pattern recognition.

Eunkyung Park received the B.E. degree in information and communications
engineering and the M.Sc. degree in electrical engineering from KAIST,
Daejeon, Korea, in 2009 and 2011, respectively.
   She is now with NAVER, the first and largest search portal in Korea,
where she works on business and planning for web portal services. Her
research interests are statistical learning theory, social networking, and
machine learning.

Munchurl Kim (M’07) received the B.E. degree in electronics from Kyungpook
National University, Korea, in 1989, and the M.E. and Ph.D. degrees in
electrical and computer engineering from the University of Florida,
Gainesville, Florida, in 1992 and 1996, respectively.
   After his graduation, he joined the Electronics and Telecommunications
Research Institute (ETRI), where he led the Broadcasting Media Research
Team and the Realistic Broadcasting Research Team and worked in research
areas related to MPEG-4/7 standardization. In 2001, he joined the
Information and Communications University (ICU) in Daejeon, Korea, as an
Assistant Professor in the School of Engineering. Since 2009, he has been
an Associate Professor in the Department of Electrical Engineering at
KAIST, Daejeon, Korea. His research areas of interest include 2D/3D video
coding, 3D video quality assessment, pattern recognition and machine
learning, and video analysis and understanding.

T bc(김은희)

  • 1. 674 IEEE TRANSACTIONS ON BROADCASTING, VOL. 57, NO. 3, SEPTEMBER 2011
An Automatic Recommendation Scheme of TV Program Contents for (IP)TV Personalization
Eunhui Kim, Shinjee Pyo, Eunkyung Park, and Munchurl Kim, Member, IEEE

Abstract: Due to the rapid increase of contents available under the convergence of broadcasting and Internet, efficient access to personally preferred contents has become an important issue. In this paper, an automatic recommendation scheme based on collaborative filtering is presented for intelligent personalization of (IP)TV services. The proposed scheme does not require TV viewers (users) to make explicit ratings on their watched TV program contents. Instead, it implicitly infers the users’ interests on the watched TV program contents. For the recommendation of user preferred TV program contents, our proposed recommendation scheme first clusters TV users into similar groups based on their preferences on the content genres from the user’s watching history of TV program contents. For the personalized recommendation of TV program contents to an active user, a candidate set of preferred TV program contents is obtained via collaborative filtering for the group to which the active user belongs. The candidate TV programs for recommendation are then ranked by a proposed novel ranking model. Finally, a set of top-ranked TV program contents is recommended to the active user. The experimental results show that the proposed TV program recommendation scheme yields about 77% of average precision accuracy and 0.135 value of ANMRR (Average Normalized Modified Retrieval Rank) with top five recommendations for 1,509 people.

Index Terms: Collaborative filtering, content based filtering, TV personalization, TV program recommendation.

Manuscript received August 31, 2010; revised February 11, 2011; accepted June 27, 2011. Date of publication August 08, 2011; date of current version August 24, 2011. This work was supported by the R&D program of MKE/IITA (A1100-0801-3015, Development of Open-IPTV Technologies for Wired and Wireless Networks).
E. Kim and E. Park are with the Dept. of Electrical Engineering at Korea Advanced Institute of Science and Technology (KAIST), Daejeon 305-701, Korea (e-mail: lins77@kaist.ac.kr; epark@kaist.ac.kr).
S. Pyo is with the Dept. of Information and Communications Engineering at KAIST, Daejeon 305-701, Korea (e-mail: sjpyo@kaist.ac.kr).
M. Kim is with the Dept. of Electrical Engineering at KAIST, Daejeon 305-701, Korea (e-mail: mkim@ee.kaist.ac.kr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBC.2011.2161409
0018-9316/$26.00 © 2011 IEEE

I. INTRODUCTION
DUE to the convergence of broadcasting and internet, the number of TV program contents available at user sides is rapidly increasing and the accessibility to the TV program contents becomes an important issue in TV watching environments of traditional TV, IPTV or TV portals services. Therefore, it is important for users (TV viewers) to easily find and access their preferred ones from TV program contents available to their terminals. There are two approaches to the provision of user’s preferred TV program contents; content searching and recommendation. For content searching, TV users usually input the query words into search engines via graphical user interface (GUI), and get the search results from which they finally select their preferred ones. The disadvantage of content searching is that the users do not even know the keywords to search what they want. For content recommendation, it can be possible to recommend to the users their preferred TV program contents. Content recommendation based approach can greatly alleviate user’s burden to access their preferred TV program contents. That is, the number of interactions to TV program GUI can be reduced which is often the case in the traditional (IP)TV environments. We use the term “(IP)TV” as “IPTV and conventional TV” in this paper.
Collaborative filtering (CF) has often been used to recommend goods for e-commerce in Internet. The main idea of CF techniques is based on item preferences of similar users [1]–[3]. However, the CF has the following characteristics: (1) it is often designed to recommend to the active users (the target users for recommendation) the items which have not been purchased (consumed) before. This is not always appropriate for TV viewers because they are often likely to watch the same TV program series such as drama, series, and news etc which are repeatedly broadcast; (2) high computational complexity is caused to deal with many users and items.
The content based filtering (CBF) has been used in information filtering [1]–[3]. The CBF usually recommends items based on the previously evaluated description by the active users. The weak points of CBF are as follows: (1) its recommendation is restricted to the items that the active user has rated or consumed. In other words, it is over-specialized recommendation which heavily depends on user’s consumption history on items; (2) it also requires heavy computational complexity for reliable recommendation in analyzing various characteristics of the items that users have consumed [6].
In this paper, we present a personalized automatic recommendation scheme of TV program contents based on CF. The proposed recommendation scheme consists of three parts: user profile reasoning, user clustering and recommendation of TV program contents. Unlike the traditional CF-based item recommendation that requires explicit ratings on the purchased items by users, the proposed recommendation scheme implicitly learns the user’s interest on the TV program contents and genres, which does not require user’s explicit ratings on their watched TV program contents, thus making it more practical in real TV watching environments. For user clustering, similar user groups are clustered based on the feature vectors of TV program genres computed from the usage history of the watched TV program contents by users. Two methods of user
  • 2. KIM et al.: AUTOMATIC RECOMMENDATION SCHEME OF TV PROGRAM CONTENTS FOR (IP)TV PERSONALIZATION 675
clustering are compared in this paper: demographic clustering and K-means clustering. Finally, CF-based recommendation is performed with a novel ranking model which extends the Best Match (BM) model [7]–[9] to rank the candidate TV program contents for recommendation. The proposed rank model is designed to make easy access to preferred TV program contents. For the recommendation of popular or newly broadcast TV program contents, the popular TV program contents can be identified via CF from similar user groups. On the other hand, newly broadcast TV program contents are preferably recommended by restricting not recently broadcast TV program contents outside a sliding time window in the history data of watched TV program contents.
This paper is organized as follows: Section II reviews the previous related works for recommendation; Section III introduces the overall system architecture of our proposed automatic recommendation scheme for TV program contents and describes the data used for experiments; Section IV describes the components of the proposed recommendation scheme in detail: user profile reasoning, similar user clustering, and ranking of candidate TV program contents for recommendation; In Section V, the proposed rank model is explained in detail; The experimental results are presented in Section VI; Finally, Section VII concludes this work.

II. RELATED WORKS
PTV [4] adopted a hybrid method of CF and CBF to supplement the item ramp-up problem of CF and the user ramp-up problem of CBF [1]. It requires users to provide their preference information on contents while enrolling. Based on this preference information provided by the users, it creates and manages user profiles with explicit ratings by users. However, in general users do not want to offer their personal information or sometimes do not faithfully exhibit their interests on the items with explicit ratings. Pazzani et al. reported that only 15% of people respond to the request for the relevance feedback on their preference [5]. Therefore, requiring users to rate explicitly on items is one of the main reasons that cause rating sparsity problems [3].
Deshpande M. et al. proposed an item based top-N recommendation algorithm [11] and J. Wang et al. [12] proposed an extension to “relevance model” from language model [13], both of which utilize CF with user-profile and item matching for item recommendation. The item based top-N recommendation algorithm uses an item-to-item matrix for item recommendation which is computed based on a user-to-item matrix [11] for which the recommendation performance is further improved by extending a language model to a relevance model [12].
In item recommendation of e-commerce, recommender systems tend to suggest new items to the users because they are not likely to repurchase the same items or similar kinds after they have bought them. However, this may not be appropriate in TV environments where TV viewers are expected to watch (consume) the TV program contents (items) that they have been accustomed to watch. In general, TV viewers tend to watch popular TV program contents as their similar taste users do or specific TV program contents according to their individual preferences. So, the previous two models are insufficient in that the more frequently watched TV program contents by an active user are recommended in lower ranks. The proposed rank model in this paper tries to remedy this weak point of the previous rank models.
In the CF systems, for a large number of users, the process of grouping similar users and recommending items entails a computational complexity issue [1], [3]. To solve the system overload in clustering similar taste users for a large number of users, G.R. Xue suggests a two-step clustering method by using offline K-means clustering and online PCC (Pearson correlation coefficient) clustering for the item ratings [16]. In this method, the K-means clustering is performed offline for a large number of users just one time for a given K value. Then more similar users are extracted online for active users based on PCC values from their respective clusters to predict the rating values for the unrated items. However, this method needs to know an appropriate cluster number a priori.
For user clustering in this paper, we compare two clustering methods: demographic clustering and K-means clustering. The demographic clustering is very simple to only use user’s demographic information such as genders and ages for clustering. For the K-means clustering, we use the feature vectors of user preference values of 8 genres and 47 subgenres for TV program contents. The former one is computationally very simple and can be a solution for the cold-start problem which takes time to learn users, but requires a priori knowledge about the demographic information. On the other hand, the latter one does not require such demographic information but implicitly clusters similar users based on the genre preference from the watched history data of TV program contents by users. For the K-means clustering, an appropriate number for K can be found by searching a range based on dendrogram of hierarchical clustering [15], [17], [18]. We then determine a K value based on the smallest sum of squared errors in this paper. The details of finding a right K value are described in Section IV. And as a rank model for ordered recommendation of TV program contents, we propose a novel rank model based on the BM model [7]–[9]. The proposed rank model is described in Section V in detail.
We can summarize the contribution points of our personalized automatic recommendation scheme for TV program contents as follows: (1) it is more appropriate for TV program recommendation since it makes implicit reasoning for user preference on TV program contents from the watched TV program history data, which does not require users to explicitly rate their watched TV program contents; (2) it takes into account not only the group preferences but also the individual user’s preferences on TV program contents for recommendation; and (3) the proposed rank model elaborates collaborative filtering by considering the relative lengths of watching times for TV program contents, not just by simply counting the number of users who have watched them.

III. ARCHITECTURE OF THE PROPOSED RECOMMENDATION SYSTEM AND EXPERIMENT DATA
A. Proposed Recommendation System Scheme
In this paper, it is assumed that TV terminals are connected to the content servers of TV programs via back channels so that
  • 3.
the usage (or watching) history of (IP)TV program contents can be collected at the server sides. In IPTV environments, TV program contents are streamed over IP networks and the responsible content providers at head-end sides can collect usage history of TV programs watched by the users via back channels.
Fig. 1 shows the architecture of our proposed automatic recommendation system for TV program contents. The automatic recommendation scheme consists of three agents: (1) the user profile reasoning agent computes user preferences on genres and TV programs by analyzing the user’s watching history of TV program contents. So, this agent collects TV usage history from local repositories of TV terminals for user profile reasoning; (2) the user clustering agent clusters the users (TV viewers) into similar preference user groups; (3) the recommendation agent recommends to each active user a list of his/her preferred TV program contents. Here, an active user means the user who logs into the TV terminal and is ready to receive a recommended TV program list. For recommendation, a list of candidate TV program contents is extracted based on CF, and our proposed rank model calculates the respective scores of the candidate TV program contents for ranked recommendation. Then the TV program contents with the top-N highest scores are presented to the active user in a descending rank order as the result of recommendation. Notice that in this paper the terms users and items are used interchangeably with TV viewers and TV program contents, respectively.
Fig. 1. Architecture of the proposed recommendation system for TV program content.

B. Description of Usage History Data Set for Watched TV Program Contents
For the experiments to test the effectiveness of the proposed recommendation scheme, Nielsen Korea’s TV usage history data set of 2,005 people is used, which was collected on 6 terrestrial TV channels for 6 months from Dec. 1, 2002 to May 31, 2003. Table I shows the data fields of the usage history data set for watched TV program contents. The TV program contents in the history data set have 8 main genres which are further divided into 47 subgenres in total. For the data set, the total number of TV program titles amounts to 924 and the total number of subtitles is 1,855. We use 795 TV program contents for training, corresponding to the first 4 months, and 629 TV program contents for testing, corresponding to the last 2 months. Notice that the sum (1,424) of the TV program contents for training and testing exceeds the total number (924) of the titles because 500 watched TV program contents are titles that were repeatedly broadcast. Table I shows the information attached to the broadcast TV program contents.
TABLE I. FIELDS OF TV USAGE HISTORY DATABASE

IV. PROPOSED RECOMMENDATION SCHEME
A. User Profile Reasoning
In this paper, a user is characterized in terms of his/her profile, which consists of two preferences: on items (TV program contents) and on genres. First of all, we remove from the usage history data set all the TV program contents that have not been watched for more than 10% of their respective total lengths. The preference on a TV program content is defined as the relatively watched ratio over the total time length. The reruns of a TV program content are all considered the same title (item). The preference on an item by a user is defined in (1), which involves the number of times the item was broadcast. The quantity used in (1) is defined in (2) in terms of the watched time length for the item by the user and the total length of the item. It must be pointed out in (1) that the item preference might be inaccurately computed for inattentively watched TV program contents. The treatment of them is out of scope in this paper.
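The profile step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the names `watch_ratio` and `item_preferences`, and the choice to divide the summed watched ratios by the number of broadcasts of a title are hypothetical, since the exact forms of (1) and (2) are not reproduced in this transcript.

```python
from collections import defaultdict

# Threshold taken from the text: viewings that do not exceed 10% of the
# programme length are removed before preferences are computed.
MIN_WATCH_RATIO = 0.10


def watch_ratio(watched_minutes, total_minutes):
    """Fraction of one broadcast that was watched, clipped to [0, 1]."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return max(0.0, min(1.0, watched_minutes / total_minutes))


def item_preferences(viewings, broadcast_counts):
    """Compute a watched-ratio preference per (user, title).

    viewings: iterable of (user, title, watched_minutes, total_minutes),
        one record per watched broadcast; reruns share the same title.
    broadcast_counts: dict mapping title -> number of times it was broadcast.

    Assumption (not confirmed by the source): the preference is the sum of
    the retained watched ratios divided by the number of broadcasts.
    """
    sums = defaultdict(float)
    for user, title, watched, total in viewings:
        ratio = watch_ratio(watched, total)
        if ratio <= MIN_WATCH_RATIO:
            continue  # treated as not really watched
        sums[(user, title)] += ratio
    return {key: s / broadcast_counts[key[1]] for key, s in sums.items()}


if __name__ == "__main__":
    viewings = [
        ("u1", "News", 30, 60),
        ("u1", "News", 60, 60),
        ("u1", "Drama", 3, 60),   # 5% watched, filtered out
        ("u2", "Drama", 45, 60),
    ]
    counts = {"News": 2, "Drama": 1}
    print(item_preferences(viewings, counts))
```

Treating reruns as one title means a viewer who follows every broadcast of a series ends up with a preference close to the average fraction watched, while a single brief viewing contributes nothing.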
  • 4.
Since the popularity or recency of TV program contents often diminishes with time and the user’s interest in TV program contents varies over time, it is more appropriate to reflect the recently watched TV program contents for recommendation. Therefore, a time window function is defined in (3), with a control parameter for the window size which is set to two-month or four-month length in this paper.
The average of the user preference on items by a user is given in (4), which is taken over the total number of items in the watched TV program list of that user.
In order to efficiently perform similar user clustering in low dimension, genre preference is used, which can reflect the similarity of users’ content consumption for TV program contents. Genre preference is computed by accumulating the item preference values for the genre and is then normalized by the total genre preference values for all genres. Given the total number of genres, the genre preference on a genre by a user is defined in (5).

B. User Clustering
For computational efficiency and effectiveness of collaborative filtering, TV users are clustered into similar user groups. After clustering, each user has a membership to one of the user groups. Therefore, the CF operation for an active user is performed for the user group to which the user belongs, not for all users.
For similar user grouping, two clustering approaches are compared: demographic clustering based on genders and ages, and K-means clustering based on the genre preference as described in (5). For K-means clustering, two feature vectors are compared, with 8 preferences on the main genres and 47 preferences on the subgenres, respectively.
The demographic clustering is computationally very simple but can only be used if the demographic information such as genders and ages is available. The demographic clustering can avoid the cold-start clustering problem that usually takes time while learning the users. In our demographic clustering, there are 26 combinations of different genders and ages. The genders are divided into two classes (male and female) and the ages are divided into 13 classes, the last of which is 66 years and older.
As an alternative, K-means clustering can be used, which does not require the demographic information. The essential prerequisite for K-means clustering is to know an appropriate number K as the number of clusters. In order to find a right K and reveal the characteristics of cluster distributions, we take a two-step approach: an unsupervised hierarchical clustering is first run to construct a dendrogram, for which a range of K values is found by cutting its branches at the large jumps in a distance criterion [14], [15], [17], [18]; the final K value is then determined in the range by repeatedly performing K-means clustering. To determine the final K value, K-means clustering is repeated up to a maximum number of times, with the centroids of the clusters randomly initialized each time. When the clustering yields the same clustering results a required number of times for a given K value, the clustering results become the final clusters with that K value. When no K value yields the same clustering results the required number of times, the K value that yields the same clustering results the largest number of times is selected as the final K value. In this paper, the maximum number of repetitions and the required number of identical results are set to 1,000 and 5, respectively. Table II shows the ranges and the finally selected K values for the feature vectors of 47 subgenres and 8 main genres for the TV program contents watched by the users who have watched at least 33% of the average number of watched TV program contents during the training period. For this experiment, the open-source code Cluster 3.0 was used [14]. The K-means clustering, which is the most time consuming task in our scheme, takes less than one minute on a PC with an Intel Core 2 Quad CPU at 2.4 GHz and 2 GB of memory.
TABLE II. SELECTION RESULTS OF CLUSTER NUMBERS, K

C. Recommendation of TV Program Contents
In order to recommend the preferred TV program contents to an active user, the recommendation process consists of three steps: extracting similar preference users from the cluster (similar user group) to which the active user belongs; selecting candidate items for recommendation; and ranking the candidate items. The rank model is explained in detail in Section V.
1) Selecting Similar Preference Users of an Active User in a Group: In Section IV-B, clustering the similar preference users is done offline. For recommendation, more relevant users are further extracted to construct a set of candidate TV program contents based on CF for the user group to which an active user belongs. By doing so, the computational complexity is lowered by reducing the number of all users to the number of similar peer users in the similar user group to which the active user belongs. Based on the proximity measure, the most similar peer users are extracted for the active user. For the proximity measure, the normalized correlation is computed by subtracting the average preference value from all the preference values [10].
On the basis of the consumed (watched) item (TV program contents) list of an active user, the similarity between the active user and each peer user is measured as the proximity between
  • 5.
the normalized preference values on items for the active user and the peer user in the similar preference groups. The similarity is defined in (6). It is computed over the items belonging to the active user’s profile, using each user’s preference value on an item as in (1) and each user’s averaged item preference value as in (4).
Only the users whose similarity to the active user satisfies a threshold condition are regarded as relevant users to the active user. Then CF is performed for the item lists between the active user and each of the relevant users. Since the number of similar preference users affects precision performance, we need to find an optimal number of peer users based on the average precision accuracy, which is explained in Section VI.
2) Filtering Candidate Items With EPG Information: After selecting the relevant users for an active user, their preference items become the candidate items for recommendation. But some items may not be available in TV channels due to the termination of broadcasting for the TV program contents. In case of linear TV broadcasting services, Electronic Program Guide (EPG) information can be used to filter out the candidate TV program contents which are not available.
3) Ranking Items: After a set of candidate TV program contents for recommendation is determined, they are ordered by a rank model. Finally, the recommended TV program contents are presented to active users in the descending order of rank scores. The proposed rank model is described in the following section.

V. PROPOSED RANK MODEL
A. Related Work: BM Model
Our proposed rank model extends the Best Match (BM) model [7]–[9]. The BM is a ranking function used by retrieval engines to rank matching documents according to their relevance to a given query. The BM model is given in (7), with a weight defined in (8). The weight in (8) depends on the number of total documents, the number of documents including a specific query term, the number of documents related with a specific topic, and the number of documents that include the specific query term and are related with the specific topic [8]. Eq. (7) uses the term frequency in documents and the term frequency in the query.
The BM model originates from two Poisson models in which the term frequency is independent of relevant and irrelevant documents [9]. Based on this idea, a simple formation of the term-frequency weight is suggested under the following two conditions: one is that the weight is independent of term frequency; and the other is that the weight is linear with term frequency [9]. But the second condition is not always satisfied. To remedy this, a scaling factor is added in the numerator. This is taken into account in the BM11, BM15 and BM25 models [7].
The BM25 includes an inappropriate condition for TV program recommendation since it gives a high weight to short documents compared to long documents by the scope hypothesis [7]. Therefore, the proposed rank model in this paper extends the BM15 model, which does not incorporate the scope hypothesis into its rank model, so that the TV program contents that were broadcast fewer times are prevented from being ranked higher.

B. Proposed Rank Model
An extension to the BM15 is made by taking into account the collaborative filtering concept that accounts for the watching times of users in the rank model for recommendation of TV program contents. Furthermore, we add to the rank model a weight with the correlation between candidate items for recommendation and the items watched by the active user. We score the filtered candidates of TV program contents in a ranked order. The relations between candidate document and query in BM15 are translated into the relations between candidate TV program contents for recommendation and the active user.
To make the BM15 applicable for recommendation of TV program contents, we have the following assumptions: (1) the watched TV program list represents the active user; (2) the relative watching frequencies of both TV program contents, among similar preference users and for the active user, are obtained by applying the CF concept to the TV program contents watched by the active user; (3) the relative watching frequency of a watched program by the active user is used; (4) the similarity between the candidate program and the watched program is further taken into account. The matching score between a candidate TV program content and the active user is defined as our proposed rank model in (9), in which two constants are used to balance the term frequency and the query term frequency in the rank model. Robertson et al. analyzed the way of weighting in detail [7]. The two constants are set to 200 and 0.2 empirically in this paper.
In (9), the relative watching frequency of the peer users is the ratio of the total number of watching times of both programs over the total number of watching times of all the TV program contents by the peer users. Therefore, the relative watching frequency is calculated as in (10).
  • 6.
Fig. 2. Illustration for significance on weights w.
In (9), the second frequency term is the ratio of the total number of watching times of a TV program content over the total number of watching times for all the TV program contents by the active user, and is given in (11).
Eq. (9) can be explained intuitively as follows: the first frequency term is regarded as the peer users’ preferences on TV program contents in the same user group to which the active user belongs, and the second is referred to as the active user’s preference on TV program contents. The two terms are in a mutually supplemental relation as they are multiplied together.
Two weights used in (9) are given in (12) and (13). The weight in (12) uses the total number of broadcast times for all items and the number of broadcast times of each item; it reflects the inverse document frequency with an independence assumption between the documents with and without the terms [8]. In this paper, it is assumed that the document for retrieval is the candidate TV program and the specific query term is a watched program from the active user profile. The weight in (13) is added for the similarity between the candidate program and the watched program, which is calculated as vector cosine correlation (VCC) similarity over the feature vectors of user preference on the two programs. This weight puts more emphasis on the active user’s personal preference on TV program contents, which is not reflected in the original BM [7], [8].
In order to see the effectiveness of the weight in (13), Fig. 2 illustrates an example of similarity measures between two items. There are two users who have watched two items (TV program contents), and the similarity (VCC) value between these two items is 0.643. On the contrary, for the two users who have both watched another pair of items, the VCC value between those items is 0.4. So, if we set 0.5 of the VCC value as a threshold for the similarity between items, then the first pair of items is considered “similar”, but the second pair is not. So, the weight of (13) in (9) can improve the rank model by taking into account the relation between the active user and the candidate items for the score calculation. The effect of this weight on precision performance will be shown in Fig. 5 in Section VI.

VI. EXPERIMENTAL RESULTS
For the usage history of watched TV program contents explained in Section III-B, we use the usage history data of four months for training and the remaining two months for testing. In this experiment section, we measure the performance of our recommendation scheme in terms of both precision/recall and Average Normalized Modified Retrieval Rank (ANMRR), which considers the rank orders in retrieval [19]–[21].

A. Performance Measure of Rank Models
Precision and Recall: The performance in information retrieval is usually measured in terms of precision and recall [22]. The precision is defined as the ratio of how many watched TV program contents (relevant documents) are contained in the recommendation list (retrieved documents) of TV program contents for an active user. The recall is defined as the ratio of how many recommended TV program contents (retrieved documents) are actually included in the watched TV programs (relevant documents) for the active user. The precision and recall are defined as
Precision = (number of watched TV program contents in the recommended list) / (number of recommended TV program contents) (14)
Recall = (number of recommended TV program contents in the watched list) / (number of watched TV program contents) (15)
For the recommendation of TV program contents, the precision is a more appropriate metric for performance evaluation than the recall since the recall accuracy is increased as the number of recommended TV program contents increases. In this regard, recommending a larger number of TV program contents increases false positives. So, in this paper, we use precision accuracy for performance evaluation. However, if the number of ground truth increases, the precision also becomes higher. So, the performance of the rank models is measured in terms of precision and recall.
ANMRR: Compared to precision measure, another performance measure, ANMRR [19]–[21], is considered which has been developed to measure the image retrieval performance in
  • 7.
MPEG-7 [20]. The ANMRR indicates not only how many correct items are recommended but also how highly the more relevant items are ranked among the recommended items. For ANMRR, Normalized Modified Retrieval Rank (NMRR) is defined in (16). It depends on the number of recommended TV program contents that the active user has really watched longer than the average watching time of his/her preferred TV program contents during the test period, and on the allowable maximum rank, which is computed from the maximum of that number [21]. The rank term in (16) is revised by (17), which uses the rank ordered in score values by the proposed rank model in this paper. Finally, ANMRR is written as in (18).

B. Clustered Data Analysis
As explained before, two clustering methods are compared: the demographic clustering and K-means clustering. A cluster’s preference on a specific genre is computed by accumulating the preferences on the specific genre by all users in the same cluster, and then it is normalized by the total number of users in the cluster. Similarly, the normalized preference on a channel can also be computed for each cluster. The preferences on genre and channel for a cluster are calculated as in (19) and (20), which use the total number of users in the cluster and the total numbers of genres and channels, respectively.
Fig. 3. Preferences on genres and channels for groups: Demographic Clustering (DM) vs. K-means clustering (KM). (a) Genre preferences of groups. (b) Channel preferences of groups.
Fig. 3 shows the profiles of clusters’ preferences on genres and channels of TV programs. As shown in Fig. 3, the genre preferences are not significantly distinguished among different groups by demographic clustering (DM). On the other hand, the groups by K-means clustering (KM) show somewhat different patterns for genre preferences among them. This is also similarly observed for the channel preferences except the group4 and group5 by DM.
Table III shows the average precision performance for different numbers of groups by DM and KM. Although the preferences on genres and channels are better distinguished by KM than DM for different groups, the performance difference of average precision between DM and KM is very slight. In this experiment, the KM turns out to be effective for recommendation
although it does not utilize the demographic information for clustering.

Table IV shows the average precision performance on Top-N recommendations for different numbers of clusters (groups) by DM. Increasing the number of clusters does not enhance the precision performance because we only use the most similar users to an active user within his/her group (the average precision performance according to the number of peer users is shown in Fig. 8).

TABLE III
COMPARISONS OF AVERAGE PRECISION BETWEEN DM AND KM
Outlier criterion (33%); refer to Table V. The number of peer users is 5; refer to Fig. 8.

TABLE IV
AVERAGE PRECISION PERFORMANCE FOR THE NUMBER OF CLUSTERS
Outlier criterion (33%); refer to Table V. The number of peer users is 5; refer to Fig. 8.

TABLE V
PRECISION ACCURACY WITH OUTLIER REMOVALS
The number of clusters is 26 by DM; refer to Table IV. The number of peer users is 5; refer to Fig. 8.

C. Exclusion of Noisy Items and Outliers of Users

For the experiments, two kinds of outliers are removed to obtain reliable recommendations: first, the TV program contents that were watched for less than 10% of their respective whole lengths are removed as noise; second, the users who have watched TV program contents fewer than a predefined number of times are excluded.

Fig. 4. Number of users versus relative watching lengths of TV program contents.

Fig. 4 shows the number of users versus the relative watching lengths for two TV program contents, “Hometown at 6 o’clock” and “Let’s marry”. For both TV program contents, there are relatively large numbers of users who watched them for less than 10% or for more than 95% of the total TV program lengths. This pattern is similarly observed in other TV program contents. So, we set 10% of the total length of a TV program content as the threshold for outlier removal.

Table V shows the average precision performance for different thresholds of outlier removal in the second case. By excluding the users whose number of watched TV program contents is less than 33% of the average number of watched TV program contents, we obtain an average precision accuracy of 76.6% for the Top-5 recommendation. The threshold values in Table V indicate the ratios of the number of TV program contents watched by each user to the average number watched by all users during the 4-month training period. The average numbers of TV program contents watched by all users are 124 and 90 during the 4-month training period and the 2-month testing period, respectively.

D. Effect of w on Precision Performance of the Proposed Rank Model

Fig. 5. Performance comparison with/without w.

Fig. 5 shows the performance comparison in terms of average precision accuracy for the recommended TV program contents with and without w in (9). The average precision accuracies with w in the proposed rank model are higher than those without it. The average precision in this experiment is measured with 67 active users.

E. Performance Comparison Between Proposed Rank Model and Linear Model

For performance comparison in precision and ANMRR between the proposed rank model and the linear model [12], 90
users are randomly selected. Figs. 6 and 7 show the performance comparisons of the proposed model and the linear model in precision-recall and ANMRR, respectively. The proposed rank model outperforms the linear model in both precision-recall and ANMRR. Notice that the smaller the ANMRR value is, the better the recommendation performance is. Ideally, ANMRR = 0 is achieved when the ranked order of the recommended TV program contents perfectly matches the order of the TV program contents watched by the active user during the test period. Therefore, the TV program contents recommended by the proposed rank model are also better matched in ranked order than those recommended by the linear model.

Fig. 6. Performance comparison in precision and recall.

Fig. 7. Performance comparison in ANMRR.

The superiority of our proposed rank model comes from the following facts:
1) The proposed rank model defines the weight in (12) such that more frequently broadcast TV program contents are put in lower ranks;
2) For recommendation of TV program contents, the traditional models usually intensify the preference of the peer users but relatively reduce the preference of an active user, which might be appropriate for recommending unpurchased items to active users in e-commerce environments. However, in TV environments, users often tend to watch the TV program contents that they used to watch. Therefore, it is reasonable to take into consideration the preferences of both similar users and an active user for recommendation. The proposed rank model actually considers both;
3) It takes into account the numbers of times that peer users watched both TV program contents in question in the score calculation for ranking. This makes the collaborative filtering more elaborate. On the other hand, the linear rank model [12] simply counts the number of peer users who have watched both of them.

F. Performance Analysis for Proposed Recommendation Scheme

We investigate the performance of the proposed recommendation in terms of the number of clusters, the number of similar peer users, and the number of TV program contents for final recommendation. Fig. 8 shows the precision performance for different numbers of similar peer users given Top-5 recommendations and 4 clusters.

Fig. 8. Precision accuracies versus different numbers of similar peer users. (a) Precision accuracies with Top-5 recommendations. (b) Precision accuracies with four clusters.

In Fig. 8(a), the average precision performance slightly decreases as the number of similar peer users increases for different numbers of clusters. This is because a smaller number of similar peer users yields more correlation between the active user and the peer users, so that the resulting recommendation precision is usually enhanced. When the number of clusters increases, the resulting recommendation precision seldom varies for different numbers of recommended items, because the larger the number of clusters, the more correlated the clustered users are. In Fig. 8(b), the average precision performance of the proposed recommendation scheme decreases as the number of recommended TV program contents increases.

Table VI shows the 19 recommended TV program contents by the proposed rank model for the corresponding ground truth items out of 67 for an active user with ID = 213039903.
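The precision, recall, and ANMRR measures used in the comparisons above can be illustrated with a short sketch. The Python code below is a minimal illustration, not the authors' implementation: it assumes that (16)–(18) follow the standard MPEG-7 formulation of NMRR and ANMRR described in [19]–[21], and it treats each active user as one query. All function names, variable names, and the toy data are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple


def precision_recall_at_n(recommended: Sequence[str],
                          relevant: Set[str],
                          n: int) -> Tuple[float, float]:
    """Precision and recall of the first n recommended items."""
    top_n = list(recommended[:n])
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def nmrr(ranked: Sequence[str], relevant: Set[str], gtm: int) -> float:
    """NMRR for one active user (standard MPEG-7 form, assumed here).

    ranked   : recommended items, best score first
    relevant : ground-truth items of this user
    gtm      : largest ground-truth set size over all users
    """
    ng = len(relevant)
    if ng == 0:
        raise ValueError("NMRR is undefined without ground-truth items")
    k = min(4 * ng, 2 * gtm)  # allowable maximum rank
    position = {item: r for r, item in enumerate(ranked, start=1)}
    total = 0.0
    for item in relevant:
        r = position.get(item)
        # Items missing from the list or ranked beyond k get the penalty 1.25*k.
        total += r if (r is not None and r <= k) else 1.25 * k
    avg_rank = total / ng
    modified_rank = avg_rank - 0.5 * (1 + ng)
    return modified_rank / (1.25 * k - 0.5 * (1 + ng))


def anmrr(rankings: Dict[str, Sequence[str]],
          ground_truths: Dict[str, Set[str]]) -> float:
    """Average of NMRR over all active users that have ground truth."""
    gtm = max(len(g) for g in ground_truths.values())
    scores = [nmrr(rankings[u], ground_truths[u], gtm) for u in ground_truths]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data: one user with a perfect ranking, one with no correct item.
    rankings = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
    truths = {"u1": {"a", "b"}, "u2": {"a", "b"}}
    print(precision_recall_at_n(rankings["u1"], truths["u1"], 3))
    print(anmrr(rankings, truths))
```

Under this formulation, NMRR is 0 when all ground-truth items occupy the top ranks and 1 when none of them appears within the allowable maximum rank, which is consistent with the statement that a smaller ANMRR indicates better recommendation performance.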
As aforementioned, the more frequently watched TV program contents, such as daily news, daily soap operas, and weekly regular dramas, appear in higher ranks. This helps active users easily access the TV program contents that they frequently watch. On the other hand, the low-ranked items by the proposed rank model are the TV program contents that were seldom or never watched by the active user but were frequently watched by his/her peer users, which results from the incorporation of collaborative filtering into the recommendation.

TABLE VI
RECOMMENDATION RESULTS AND GROUND TRUTH FOR AN ACTIVE USER WITH ID = 213039903
# recommendation order, ## preference order of the active user.

VII. CONCLUSION

In this paper, we propose an automatic recommendation scheme of (IP)TV program contents for TV personalization. Unlike traditional recommendation in document retrieval or e-commerce, the proposed scheme does not require explicit ratings on watched TV program contents; instead, it implicitly infers user preference on TV program contents from the usage history data of watched TV program contents. The rank model in the proposed scheme takes into account not only the group preferences but also the active user’s preferences on TV program contents. Furthermore, the proposed rank model elaborates collaborative filtering by considering the relative lengths of watching times for TV program contents, not just by counting the number of users who have watched them. The effectiveness of our proposed recommendation scheme is shown by extensive experimental results on a real usage history dataset of watched TV program contents.

REFERENCES

[1] M. Montaner, B. Lopez, and J. L. de la Rosa, “A taxonomy of recommender agents on the Internet,” AI Rev., vol. 19, pp. 285–330, 2003.
[2] R. Burke, “Hybrid recommender systems: Survey and experiments,” User Modeling and User-Adapted Interaction, vol. 12, no. 4, pp. 331–370, Nov. 2002.
[3] G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions,” IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749, Jun. 2005.
[4] P. Cotter and B. Smyth, “PTV: Intelligent personalized TV guides,” Amer. Assoc. AI, pp. 957–964, 2000.
[5] M. Pazzani and D. Billsus, “Learning and revising user profiles: The identification of interesting web sites,” Machine Learning, vol. 27, pp. 313–331, 1997.
[6] N. Good, J. B. Schafer, J. A. Konstan, A. Borchers, B. Sarwar, J. Herlocker, and J. Riedl, “GroupLens research project, combining collaborative filtering with personal agents for better recommendations,” Amer. Assoc. AI, 1999.
[7] S. E. Robertson, S. Walker, M. Beaulieu, M. Gatford, and A. Payne, “Okapi at TREC-4,” in 4th Text Retrieval Conf. (TREC-4), 1995, pp. 73–96.
[8] S. E. Robertson and K. Sparck Jones, “Relevance weighting of search terms,” J. Amer. Soc. Inf. Sci., vol. 27, pp. 129–146, 1976.
[9] S. E. Robertson and S. Walker, Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. New York: Springer-Verlag, 1994, pp. 232–241.
[10] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “GroupLens: An open architecture for collaborative filtering of netnews,” in ACM Conf. Comput. Supported Cooperative Work, 1994, pp. 175–186.
[11] M. Deshpande and G. Karypis, “Item-based top-N recommendation algorithms,” ACM Trans. Inf. Syst., vol. 22, no. 1, pp. 143–177, Jan. 2004.
[12] J. Wang, J. Pouwelse, J. Fokker, A. de Vries, and M. Reinders, “Personalization on a peer-to-peer television system,” Multimedia Tools Appl., vol. 36, no. 1/2, pp. 89–103, 2007.
[13] J. Lafferty and C. Zhai, “Probabilistic relevance models based on document and query generation,” Language Modeling Inf. Retrieval, 2002.
[14] M. J. L. De Hoon, S. Imoto, J. Nolan, and S. Miyano, “Open source clustering software,” Bioinformatics, p. 781, 2004.
[15] D. P. Vetrov and L. I. Kuncheva, “Evaluation of stability of K-means cluster ensembles with respect to random initialization,” IEEE Trans. PAMI, vol. 28, no. 11, pp. 1798–1808, 2006.
[16] G. Xue, C. Lin, Q. Yang, W. Xi, H. Zeng, Y. Yu, and Z. Chen, “Scalable collaborative filtering using cluster-based smoothing,” in ACM SIGIR, Aug. 2005, pp. 114–121.
[17] S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd ed. New York: Academic Press, 2006, pp. 572–582.
[18] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley-Interscience, 2001, pp. 542–559.
[19] B. S. Manjunath, J.-R. Ohm, V. V. Vasudevan, and A. Yamada, “Color and texture descriptors,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 6, pp. 703–715, Jun. 2001.
[20] P. Ndjiki-Nya, J. Restat, T. Meiers, J.-R. Ohm, A. Seyferth, and R. Sniehotta, “Subjective evaluation of the MPEG-7 retrieval accuracy measure (ANMRR),” in ISO/WG11 MPEG Meeting, Geneva, Switzerland, May 2000, Doc. M6029.
[21] W. Ka-Man and P. Lai-Man, “MPEG-7 dominant color descriptor based relevance feedback using merged palette histogram,” in IEEE Int. Conf. Acoust., Speech, Signal Process., May 2004, vol. 3, pp. 433–436.
[22] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge, U.K.: Cambridge Univ. Press, 2008, pp. 151–175.

EunHui Kim received the B.E. degree in information and communications engineering from Chungnam National University in 2000 and the M.Sc. degree in information and communications engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2009. She is currently pursuing the Ph.D. degree in the Department of Electrical Engineering at KAIST.
She worked for Samsung Electronics as an Assistant Engineer in the Software team of the Visual Display Division from 2000 to 2003 in Suwon, Korea, and as an Associate Engineer in the Architecture team of the Digital Solution Center from 2003 to 2007 in Seoul, Korea. Her research interests include personalization in connected TV, data clustering, collaborative filtering, and recommendation modeling with AI for smart TV interaction.
Shinjee Pyo received the B.E. degree and the M.Sc. degree in information and communications engineering from KAIST, Daejeon, Korea, in 2007 and 2009, respectively. She is currently pursuing the Ph.D. degree in information and communications engineering at KAIST. Her research interests include personalization in connected TV, sequential pattern mining for TV personalization, and pattern recognition.

Eunkyung Park received the B.E. degree in information and communications engineering and the M.Sc. degree in electrical engineering from KAIST, Daejeon, Korea, in 2009 and 2011, respectively. She has joined NAVER, the first and largest search portal in Korea, where she works on business and planning for web portal services. Her research interests are statistical learning theory, social networking, and machine learning.

Munchurl Kim (M’07) received the B.E. degree in electronics from Kyungpook National University, Korea, in 1989, and the M.E. and Ph.D. degrees in electrical and computer engineering from the University of Florida, Gainesville, Florida, in 1992 and 1996, respectively.
After his graduation, he joined the Electronics and Telecommunications Research Institute (ETRI), where he led the Broadcasting Media Research Team and the Realistic Broadcasting Research Team and worked in MPEG-4/7 standardization-related research areas. In 2001, he joined the School of Engineering of the Information and Communications University (ICU) in Taejon, Korea, as an Assistant Professor. Since 2009, he has been an Associate Professor in the Department of Electrical Engineering at KAIST, Daejeon, Korea. His research areas of interest include 2D/3D video coding, 3D video quality assessment, pattern recognition and machine learning, and video analysis and understanding.