A Pointlogic White Paper
Data Fusion:
Combining Multiple Analysis

Susanne Hartog-Buijtenhek




                            enabling smart decisions
                                     www.pointlogic.com

Preface
Every year, large sums of money are spent on advertising. Advertisers want
to know what their advertising will yield, and therefore how many people
will see an advertisement (that is, how many people will be reached).
Several respondent-based studies are available to meet this need for
information. For example, the reach of magazines and newspapers is measured
by print studies. In a print study, a so-called ‘reading probability’ is
available for every respondent. This ‘reading probability’ serves as the
input for computing the reach.

In contrast, the reach of websites is measured by an internet study that
tracks the behavior of internet respondents. The results are used to
compute the probability that a respondent visits a certain website in a
certain period. The resulting data are published by independent agencies
and serve as the currency in the market.

Advertisers show an increasing demand for combined reach figures. This
results from the increased use of several media in a single advertising
campaign. Moreover, publishers of print media often have an accompanying
website. Hence the question: who is reached both by an advertisement in a
magazine or newspaper and by an advertisement on the internet?
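
The question above can be made concrete with a small calculation. The
sketch below is not taken from the studies described here: it assumes that
each respondent has one print probability and one website probability, and
that exposure to the two media is independent within a respondent. The
function names and the example numbers are hypothetical.

```python
def reach(probabilities, weights=None):
    """Weighted mean exposure probability: the expected share of people reached."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def combined_reach(print_probs, web_probs, weights=None):
    """Return (net reach, overlap).

    Assumption: within one respondent, print and web exposure are independent.
    """
    both = [p * q for p, q in zip(print_probs, web_probs)]
    either = [p + q - p * q for p, q in zip(print_probs, web_probs)]
    return reach(either, weights), reach(both, weights)


# Hypothetical probabilities for four respondents.
print_probs = [0.5, 0.0, 1.0, 0.2]
web_probs = [0.5, 0.4, 0.0, 0.0]
net, overlap = combined_reach(print_probs, web_probs)
print(net, overlap)
```

Under these assumptions the net reach equals print reach plus web reach
minus the overlap, so the overlap is exactly the quantity a combined
dataset has to supply.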

Data fusion
A combined study, with information on both print reach and internet
reach, could be created by setting up a new survey that measures both.
However, this is not cost-efficient. An alternative is to complement the
print study with information about internet reach. This is done with a
mathematical technique that uses overlapping information, i.e. information
that is available in both studies. This technique is called data fusion.

Data fusion combines the information of two studies by using overlapping
information. We used one of the data fusion techniques to generate
combined print and internet data. This technique is related to a
well-known statistical technique called imputation, which is used to fill
in missing data in a dataset. In essence, the print study can be seen as a
study with some missing data.

The data fusion method consists of two sequential steps. First,
econometric models are estimated on the respondent data of the internet
study. Second, these models are applied to the respondents of the print
study. Since the dataset contains a large number of websites, both steps
are carried out fully automatically.
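
The two steps can be sketched in a few lines. This is only an
illustration: the "model" below is a deliberately trivial stand-in (the
mean visiting probability per value of one shared variable), not the
econometric models used in the actual fusion. The field names `age_group`,
`visit_prob` and `read_prob` and all values are hypothetical.

```python
from collections import defaultdict


def estimate(donor_rows, shared_key, target_key):
    """Step 1: fit a trivial model on the donor (internet) study."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in donor_rows:
        sums[row[shared_key]] += row[target_key]
        counts[row[shared_key]] += 1
    return {value: sums[value] / counts[value] for value in sums}


def apply_model(model, recipient_rows, shared_key, target_key, default=0.0):
    """Step 2: complete the recipient (print) study with model predictions."""
    return [
        {**row, target_key: model.get(row[shared_key], default)}
        for row in recipient_rows
    ]


# Hypothetical respondent records; age_group is the overlapping information.
internet_study = [
    {"age_group": "18-34", "visit_prob": 0.6},
    {"age_group": "18-34", "visit_prob": 0.2},
    {"age_group": "35+", "visit_prob": 0.1},
]
print_study = [
    {"age_group": "18-34", "read_prob": 0.3},
    {"age_group": "35+", "read_prob": 0.7},
]

model = estimate(internet_study, "age_group", "visit_prob")
fused = apply_model(model, print_study, "age_group", "visit_prob")
print(fused)
```

After the second step every print respondent carries both a reading
probability and an imputed visiting probability, which is the structure of
the combined file.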





The estimation of models
The first step in data fusion is estimating econometric models that explain
the internet behavior of respondents in the internet study.

Estimating models on internet data is not straightforward, because website
reach data have a specific character. For most websites, most respondents
have a visiting probability equal to zero. Of the respondents with a
visiting probability greater than zero, a substantial number still have a
probability very close to zero. This structure rules out a standard
regression model, so a more sophisticated model has to be chosen.
The overlapping information can be used for the explanatory variables.
However, the visiting probabilities of different websites are also highly
correlated with one another. If only the overlapping information is taken
into account, this correlation between websites is ignored. By including
other websites as explanatory variables, the correlation between websites
can be captured. The challenge in this approach lies in applying the
models. When a model for a website is applied to the respondent data of a
print study, information about the other websites is still missing. The
answer to this problem is an iterative technique called the Gibbs sampler,
which is discussed below.
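
The paper does not name the model that was chosen. One common way to
handle data with many exact zeros is a two-part description: the share of
respondents with a non-zero probability, and the level of the probability
among those respondents. The sketch below only illustrates that
decomposition on hypothetical numbers; it is an assumption, not the
authors' model.

```python
def two_part_summary(probs):
    """Split visiting probabilities into 'any visit at all' and 'level if positive'."""
    positive = [p for p in probs if p > 0]
    share_positive = len(positive) / len(probs)
    mean_positive = sum(positive) / len(positive) if positive else 0.0
    return share_positive, mean_positive


# Hypothetical website: seven respondents at zero, two near zero, one regular visitor.
probs = [0.0] * 7 + [0.01, 0.02, 0.5]
share, mean_pos = two_part_summary(probs)
overall = share * mean_pos  # equals the plain mean of all probabilities
print(share, mean_pos, overall)
```

Modelling the two parts separately avoids forcing one straight line
through a large cluster of zeros and a small, skewed group of positive
values.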

Models need to be estimated for over 300 websites, each with different
characteristics. Given this number, estimation cannot be done manually. We
have therefore developed an automatic estimation procedure, which selects
relevant explanatory variables per website, based on the underlying
correlations and partial correlations.
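
A selection rule of this kind can be sketched as follows. The exact
procedure is not described in this paper, so the greedy rule, the
threshold and the variable names below are assumptions. The
first-order partial correlation formula itself is standard:
r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def select_variables(target, candidates, threshold=0.3):
    """Hypothetical greedy rule: strongest correlate first, then every
    candidate that still adds information given that first choice."""
    first = max(candidates, key=lambda name: abs(pearson(candidates[name], target)))
    chosen = [first]
    for name, values in candidates.items():
        if name == first:
            continue
        if abs(partial_corr(values, target, candidates[first])) >= threshold:
            chosen.append(name)
    return chosen


# Hypothetical data for six respondents.
target = [0.0, 0.1, 0.0, 0.4, 0.5, 0.9]  # visiting probability of one website
candidates = {
    "age": [1, 2, 3, 4, 5, 6],
    "site_b": [0.0, 0.2, 0.1, 0.5, 0.4, 0.8],
    "noise": [3, 1, 2, 3, 1, 2],
}
chosen = select_variables(target, candidates)
print(chosen)
```

The partial correlation prevents a variable from being selected merely
because it duplicates information already carried by a stronger variable.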

Applying models
After the models have been estimated, they have to be applied. Before the
models are actually applied, a starting value is created for every
respondent and every website. These starting values form the basis of the
Gibbs sampler that is used to apply the models.

The models of the internet study also have other websites as explanatory
variables. Once starting values have been set, websites can be used as
explanatory variables when the models are applied. The models are then
applied iteratively: in every iteration the values change, and the updated
values are carried through in the models. For convergence, it is important
that the number of websites included in each model is limited.
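
The iteration can be illustrated with a minimal Gibbs sampler for two
websites. This is a simplified sketch, not the production procedure: it
models a binary visit indicator rather than a visiting probability, uses
logistic models, and all coefficients and names (`MODELS`, `young`,
`site_a`, `site_b`) are hypothetical.

```python
import math
import random


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical fitted models: each website depends on a shared variable
# (young = 0 or 1) and on the current value of the other website.
MODELS = {
    "site_a": {"intercept": -2.0, "young": 1.0, "other": ("site_b", 1.5)},
    "site_b": {"intercept": -1.5, "young": 0.5, "other": ("site_a", 1.5)},
}


def gibbs_impute(young, n_iter=2000, burn_in=500, seed=0):
    """Alternately redraw each website given the other; return visit shares."""
    rng = random.Random(seed)
    state = {"site_a": 0, "site_b": 0}  # starting values
    totals = {site: 0 for site in MODELS}
    for iteration in range(n_iter):
        for site, model in MODELS.items():
            other_site, coef = model["other"]
            p = logistic(
                model["intercept"] + model["young"] * young + coef * state[other_site]
            )
            state[site] = 1 if rng.random() < p else 0
        if iteration >= burn_in:  # discard draws that still depend on the start
            for site in MODELS:
                totals[site] += state[site]
    kept = n_iter - burn_in
    return {site: totals[site] / kept for site in MODELS}


print(gibbs_impute(young=0))
print(gibbs_impute(young=1))
```

Because every draw uses the most recent value of the other website, the
correlation between websites is carried into the imputed data even though
the print respondents had no website information to begin with.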

When applying the models, one could choose to impute the expected value per
respondent. However, in order to preserve the variance in visiting
probabilities, a better alternative is to impute random draws from each
respondent’s model-based probability distribution.
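
The effect on the variance can be shown with a small simulation. As a
simplification, the sketch below imputes a binary visit indicator for
1,000 hypothetical respondents who all have the same modelled probability
of 0.3; the function names are assumptions.

```python
import random
import statistics


def impute_expected(probs):
    """Impute the expected value: every respondent gets the model mean."""
    return list(probs)


def impute_draws(probs, seed=0):
    """Impute a random draw from each respondent's distribution."""
    rng = random.Random(seed)
    return [1.0 if rng.random() < p else 0.0 for p in probs]


probs = [0.3] * 1000
expected = impute_expected(probs)
draws = impute_draws(probs)

print(statistics.mean(expected), statistics.pvariance(expected))
print(statistics.mean(draws), statistics.pvariance(draws))
```

Both imputations give roughly the same average, but the expected-value
imputation has no spread at all, whereas the draws retain the spread of
the underlying distribution.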

Results
The results of the data fusion technique have been extensively validated.
This was possible because a subset of respondents was present in both
studies¹. Having both the true values and the imputed values for these
respondents provides an unbiased way to validate the model’s results,
which were remarkably positive.
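
A validation on overlapping respondents can be sketched as a comparison of
true and imputed values. The metrics chosen here (reach, bias and mean
absolute error) and the example numbers are assumptions for illustration;
the paper does not report which measures were used.

```python
def validate(true_vals, imputed_vals):
    """Compare true and imputed visiting probabilities for the same respondents."""
    n = len(true_vals)
    diffs = [i - t for t, i in zip(true_vals, imputed_vals)]
    return {
        "true_reach": sum(true_vals) / n,
        "imputed_reach": sum(imputed_vals) / n,
        "bias": sum(diffs) / n,                  # average over- or underestimate
        "mae": sum(abs(d) for d in diffs) / n,   # average size of the error
    }


# Hypothetical overlapping respondents.
true_vals = [0.0, 0.5, 0.2, 0.0]
imputed_vals = [0.1, 0.4, 0.2, 0.0]
report = validate(true_vals, imputed_vals)
print(report)
```

A bias close to zero indicates that the reach is reproduced on average,
while the mean absolute error shows how far individual respondents can
still be off.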

Qualitative validations are at least as important as quantitative
validations. From a mathematical point of view, the results contain the
average reach as well as the variance of the unbiased estimators. The
comparison on the overlapping respondents provides another validation of
the method used. Unfortunately, this does not automatically mean that the
analysis is accepted by the market. Other validations are necessary for
general acceptance.

Two validations that were carried out are an assessment of the significant
explanatory variables used in the models and an assessment of the final
combined reach data. The variables used simply need to be ‘logical’.

However, the most important thing is whether the publishers recognize the
final overlap and experience it as logical. Some publishers strive to make
the overlap as small as possible and therefore aim to attract a different
audience. Others aim for as large an overlap as possible, which is
achieved by, for example, placing a reference to a website in a magazine.
Whether the final overlap is recognized is an important part of the
acceptance and hence of the validation.

Future
To conclude, the column-wise data fusion method provides very good
results for combined print-internet reach. The method will be applied
semi-annually, each time based on the new print and internet studies, in
order to generate a combined analysis file.

These results make it desirable to test the methodology on other studies.
In doing so, the method can quite easily be extended with new model
formulations, which can then be used to determine combined reach with
other media.




¹ Both analyses come from the same agency.

    About Pointlogic | enabling smart decisions
    Founded in 1992 by Peter Kloprogge and Sjoerd Mostert - with offices
    in New York, London, Frankfurt, Sydney, Amsterdam, and Rotterdam
    - Pointlogic combines cutting-edge research, advanced mathematical
    modeling, and flexible software tools to enable our clients to make
    smart decisions.

    Pointlogic works together with clients, applying fresh, analytical
    thinking to problems. We then use powerful mathematical modeling
    to generate insight into clients’ choices. And then, most importantly,
    we deliver concrete, software-based solutions that clients can both
    implement and distribute across internal and partner networks.

    For more information about any of Pointlogic’s products or for
    press inquiries please contact Nicole Alexander:

    Office: 212-683-2330
    E-Mail: alexander@pointlogic.com




