Monday, March 9, 2009

experimental variables

Experimental techniques
From Wikipedia, the free encyclopedia

Experimental research designs are used for the controlled testing of causal processes. The general procedure is to manipulate one or more independent variables and determine their effect on a dependent variable. These designs can be used where: 1) there is time priority in a causal relationship (cause precedes effect), 2) there is consistency in a causal relationship (a cause will always lead to the same effect), and 3) the magnitude of the correlation is large. The most common applications of these designs in marketing research and experimental economics are test markets and purchase labs. The techniques are commonly used in other social sciences, including sociology and psychology.
Contents

* 1 Controls
* 2 Purchase laboratory
* 3 Test markets
* 4 Experimental research designs

Controls

One of the most important requirements of experimental research designs is the necessity of eliminating the effects of spurious, intervening, and antecedent variables. In the most basic model, cause (X) leads to effect (Y). But there could be a third variable (Z) that influences (Y), and X might not be the true cause at all. Z is said to be a spurious variable and must be controlled for. The same is true for intervening variables (variables that lie between the supposed cause (X) and the effect (Y)) and antecedent variables (variables prior to the supposed cause (X) that are the true cause). When a third variable is involved and has not been controlled for, the relation is said to be a zero-order relationship. In most practical applications of experimental research designs there are several causes (X1, X2, X3). In most designs, only one of these causes is manipulated at a time.
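To make the spurious-variable idea concrete, here is a minimal simulation sketch in Python. The data and numbers are invented for illustration (not from the article): a common cause Z drives both X and Y, producing a strong zero-order correlation between them that largely disappears once Z is held fixed by stratification.

```python
# A spurious variable Z drives both X and Y, so X and Y correlate
# even though X has no causal effect on Y. Holding Z nearly constant
# (stratifying) makes the apparent X-Y relationship disappear.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

z = rng.normal(size=n)                   # spurious variable (common cause)
x = z + rng.normal(scale=0.5, size=n)    # X is driven by Z, not a cause of Y
y = z + rng.normal(scale=0.5, size=n)    # Y is also driven by Z

# Zero-order (uncontrolled) correlation looks strong.
print("uncontrolled r(X, Y):", np.corrcoef(x, y)[0, 1])

# Control for Z by looking only at cases where Z is nearly constant.
stratum = np.abs(z) < 0.1
print("r(X, Y) with Z held fixed:",
      np.corrcoef(x[stratum], y[stratum])[0, 1])
```

Run as written, the uncontrolled correlation is roughly 0.8 while the within-stratum correlation is near zero, which is exactly the signature of a zero-order relationship produced by an uncontrolled third variable.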

Purchase laboratory

A true experimental design requires an artificial environment so as to control for all spurious, intervening, and antecedent variables. A purchase laboratory approaches this ideal. Participants are given money, scrip, or credit to purchase products in a simulated store. Researchers modify one variable at a time (for example, price, packaging, shelf location, size, or competitors' offerings) and determine what effect that has on sales volume. Internet-based purchase labs (called virtual purchase labs) are becoming more common.
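As an illustration of this one-variable-at-a-time logic, the following Python sketch compares simulated sales volume under two price conditions with a two-sample t-test. The Poisson rates and sample sizes are assumptions chosen for the example, not data from any real purchase lab.

```python
# Hypothetical purchase-lab analysis: one variable (price) is manipulated
# while everything else is held constant, and the effect on units sold
# per participant session is compared across the two conditions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated units purchased per participant session (illustrative numbers).
sales_low_price = rng.poisson(lam=5.0, size=60)    # price condition A
sales_high_price = rng.poisson(lam=4.2, size=60)   # price condition B

t, p = stats.ttest_ind(sales_low_price, sales_high_price)
print(f"mean A={sales_low_price.mean():.2f}, "
      f"mean B={sales_high_price.mean():.2f}, t={t:.2f}, p={p:.3f}")
```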

Simplified versions of the purchase laboratory are often used for pragmatic reasons. An example of this would be to use tachistoscopes for testing packaging and shelf location.

Test markets

Quasi-experimental designs control some, but not all, of the extraneous factors. A test market is an example of this. A new product is typically introduced in a select number of cities. These cities must be representative of the overall national (or international) population. They should also be relatively unpolluted by outside influences (for example, media from other cities). The marketer has some control over the marketing mix variables, but almost no control over the broader business environment variables. Competitors could change their prices during the test. The government could change the level of taxes. New competing products could be introduced. An advertising campaign could be initiated by competitors. Any of these spurious variables could contaminate the test market.

Experimental research designs

In an attempt to control for extraneous factors, several experimental research designs have been developed, including:

* Classical pretest-posttest - The total population of participants is randomly divided into two samples: a control sample and an experimental sample. Only the experimental sample is exposed to the manipulated variable. The researcher compares the pretest results with the posttest results for both samples. Any divergence between the two samples is assumed to be a result of the experiment (see the sketch after this list).
* Solomon four-group design - The population is randomly divided into four samples. Two of the groups are experimental samples; two groups experience no experimental manipulation of variables. Two groups receive a pretest and a posttest; two groups receive only a posttest. This is an improvement over the classical design because it controls for the effect of the pretest.
* Factorial design - This is similar to the classical design, except that additional samples are used. Each group is exposed to a different experimental manipulation.
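
The classical pretest-posttest analysis can be sketched in Python as follows. Everything here is illustrative: the sample size, score scale, and the assumed true treatment effect of 5 points are invented for the example. The point is the mechanism: random assignment, a manipulation applied only to the experimental sample, and the divergence in pretest-to-posttest gains as the effect estimate.

```python
# Classical pretest-posttest design: participants are split at random
# into control and experimental samples, only the experimental sample
# receives the manipulation, and the divergence in pretest-to-posttest
# change between the samples estimates the manipulation's effect.
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Random assignment into two equal samples (True -> experimental).
assignment = rng.permutation(n) < n // 2

pretest = rng.normal(loc=50, scale=10, size=n)
effect = 5.0                               # assumed true treatment effect
posttest = pretest + rng.normal(scale=4, size=n) + effect * assignment

gain = posttest - pretest
divergence = gain[assignment].mean() - gain[~assignment].mean()
print(f"estimated treatment effect: {divergence:.2f}")   # close to 5
```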


Experimental Variables
To test our hypothesis about the benefits of personalization in the ADAPTIVE PLACE ADVISOR, we controlled two independent variables: the presence of user modeling and the number of times a user interacted with the system. First, because we anticipated that users might improve their interactions with the PLACE ADVISOR over time, we divided subjects into an experimental or modeling group and a control group. The 13 subjects in the modeling group interacted with a version of the system that updated its user model as described in Section 3.4. The 11 subjects in the control group interacted with a version that did not update the model, but that selected attributes and items from the default distribution described in Section 3.1. Naturally, the users were unaware of their assigned group.

Second, since we predicted the system's interactions would improve over time, as it gained experience with each user, we observed its behavior at successive points along this "learning curve." In particular, each subject interacted with the system for around 15 successive sessions. We tried to separate each subject's sessions by several hours, but this was not always possible. However, in general the subjects did use the system to actually help them decide where to eat, either that same day or in the near future; we did not provide constraints other than telling them that the system only knew about restaurants in the Bay Area.

To determine each version's efficiency at recommending items, we measured several conversational variables. One was the average number of interactions needed to find a restaurant accepted by the user. We defined an interaction as a cycle that started with the system providing a prompt and ended with the system's recognition of the user's utterance in response, even if that response did not answer the question posed by the prompt. We also measured the time taken for each conversation. This began when a "start transaction" button was pushed and ended when the system printed "Done" (after the user accepted an item or quit).

We also collected two statistics that should not have depended on whether user modeling was in effect. The first was the number of system rejections, that is, the number of times that the system either did not obtain a recognition result or its confidence was too low. In either case the system asked the user to repeat himself. Since this is a measure of recognition quality and not of the effects of personalization, we omitted it from the count of interactions. A second, more serious problem was a speech misrecognition error, in which the system assigned an utterance a different meaning than the user intended.

Effectiveness, and thus the subjective quality of the results, was somewhat more difficult to quantify. We wanted to know each user's degree of satisfaction with the system's behavior. One such indication was the rejection rate: the proportion of attributes about which the system asked but the subject did not care (REJECTs in ATTEMPT-CONSTRAIN situations). A second measure was the hit rate: the percentage of conversations in which the first item presented was acceptable to the user. Finally, we also administered a questionnaire to users after the study to get more subjective evaluations.
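A rough sketch of how these conversational measures could be computed from per-session logs follows. The log format (one record per session, with the fields shown) and the sample values are hypothetical, not the ADAPTIVE PLACE ADVISOR's actual instrumentation; the counting rule of excluding system rejections from the interaction count follows the description above.

```python
# Hypothetical computation of the conversational measures described above.
from statistics import mean

sessions = [
    # "interactions" already excludes system rejections, per the counting
    # rule above; fields and values are invented for illustration.
    {"interactions": 7, "seconds": 142.0, "attempt_constrain": 5,
     "rejects": 1, "first_item_accepted": True},
    {"interactions": 9, "seconds": 198.5, "attempt_constrain": 6,
     "rejects": 2, "first_item_accepted": False},
]

# Efficiency: average interactions to an accepted item, and average time.
avg_interactions = mean(s["interactions"] for s in sessions)
avg_time = mean(s["seconds"] for s in sessions)

# Rejection rate: fraction of ATTEMPT-CONSTRAIN questions the user rejected.
rejection_rate = (sum(s["rejects"] for s in sessions)
                  / sum(s["attempt_constrain"] for s in sessions))

# Hit rate: fraction of conversations whose first presented item was accepted.
hit_rate = mean(s["first_item_accepted"] for s in sessions)

print(f"interactions={avg_interactions:.1f}, time={avg_time:.0f}s, "
      f"rejection rate={rejection_rate:.2f}, hit rate={hit_rate:.2f}")
```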
