After reading this week's material, consider, research and address the following: As you know, the residential and commercial real estate markets in many areas of the United States increased exponenti

JOURNAL OF HOUSING RESEARCH, VOLUME 23, ISSUE 2

Residential Real Estate Appraisal Bias in the Absence of Client Feedback

Julia Freybote, Alan J. Ziobrowski, and Paul Gallimore

Abstract

Client feedback and transaction price feedback, which implicitly includes client feedback, have been found to introduce an upward bias in appraisal judgments. However, new legislation such as the Dodd-Frank Act eliminates client influence on residential appraisers by introducing appraisal management companies as intermediaries between appraisers and lenders. In this study, we investigate whether the transaction price feedback-induced bias persists in the absence of client (lender) feedback. Using experimental design and residential expert appraisers, we find that the biasing effect of transaction price feedback on appraisal judgments has been eliminated. This indicates the effectiveness of the new legislation in reducing lender-induced residential appraisal bias.

Appraisers receive feedback from a number of sources, such as other appraisers, clients, and the real estate market. The impact of client feedback on appraisal judgments has received a considerable amount of attention in the behavioral real estate literature.

Researchers have found that client feedback provided by lenders and individual sellers biases residential and commercial appraisers upwardly (Kinnard, Lenk, and Worzala, 1997; Hansz, 2004; Diaz and Hansz, 2010; Zhu and Pace, 2012). Hansz and Diaz (2001) investigate market feedback, defined as transaction price feedback, and find that it also biases appraisal judgments upwardly. The authors offer client feedback, implied in transaction price feedback, as the main explanation for their findings.

The client feedback-induced appraisal bias represents a problem for other real estate market participants who require objective market value estimates for their decision making. As the biasing effect of client feedback threatens the objectiveness and integrity of the residential real estate valuation process, the Home Valuation Code of Conduct (HVCC) and its successor, the “appraisal independence standards” in the Dodd-Frank Act, were passed in 2009 and 2010, respectively [for more information about the HVCC, see Abernethy and Hollans (2010)]. The objective of this legislation is to ensure the independence of residential appraisers from lenders in order to protect borrowers and avoid biased value judgments. The Dodd-Frank Act requires lenders originating residential mortgages and selling them to Fannie Mae or Freddie Mac to use an appraisal management company (AMC) to obtain an appraisal. In particular, a lender gives an appraisal assignment to an AMC, which has a pool of residential appraisers. The AMC then selects an appraiser to complete the assignment. This new legislation fundamentally changes the relationship of residential real estate appraisers and their most important client group, lenders. While traditionally lenders and appraisers were in direct contact, the Dodd-Frank Act disconnects them and eliminates direct client feedback.

If new legislation truly disconnects appraisers and lenders, we would expect client feedback to have no significant impact on residential appraisal judgments. As principal-agent effects are difficult to model in a laboratory experiment, this investigation focuses on transaction price (market) feedback as defined by Hansz and Diaz (2001), which implicitly includes client feedback.

This study revisits the findings of Hansz and Diaz (2001) and extends them to residential appraisers in a fundamentally changed residential appraisal industry. We contribute to the real estate literature on appraisal bias in the following ways. Compared to the anchoring and adjustment bias, the impact of feedback on appraisal judgments has received little attention in the experimental behavioral real estate literature, which itself is a relatively new field of real estate research. This study adds to the scarce literature on feedback and appraisal bias, particularly with regard to residential real estate valuation.

The post-2007 changes to the residential appraisal task environment, such as new legislation, the rise of AMCs, new licensing requirements, and increased litigation, represent an excellent background for this study. Our investigation is valuable as it allows conclusions about the persistence of the previously identified asymmetric appraisal bias in this changed environment. Such knowledge is valuable in assessing the reliability of residential value judgments, the effectiveness of new legislation such as the Dodd-Frank Act and, if a bias is present, in formulating policy recommendations necessary to mitigate it.

Literature Review

Unlike the anchoring-and-adjustment heuristic bias (e.g., Wolverton, 1996; Diaz and Hansz, 1997; Diaz, 1997; Gallimore and Wolverton, 1997; Diaz and Hansz, 2001; Tidwell and Gallimore, 2014), feedback has received limited attention in the behavioral real estate literature. The majority of feedback studies focus on the effect of client feedback on appraisal judgments.

A number of researchers have investigated the motivations of clients to influence appraisers (e.g., Levy and Schuck, 1999, 2004; Wolverton and Gallimore, 1999). The main motivation for lenders, mortgage originators, sellers, and brokers to influence appraisers is to close a deal, particularly in hot real estate markets such as the pre-2007 housing market in the United States. In residential mortgage lending, if an appraisal estimate is below the negotiated sales price, the mortgage deal may fail and negatively affect the compensation of transaction parties (e.g., brokers or mortgage originators) (Cho and Megbolugbe, 1996; Chinloy, Cho, and Megbolugbe, 1997). By exerting influence on appraisers to arrive at a certain value, lenders and mortgage originators can pressure appraisers to deliver client-favorable value estimates. Levy and Schuck (2004) identify additional incentives for clients to influence appraisers, such as ensuring realistic estimates and market credibility, maximizing asset-based fees, or validating in-house valuations.

Wolverton and Gallimore (1999) investigate how different forms of client feedback affect residential and commercial appraisers. Client feedback can be divided into environmental perception feedback (e.g., client asks to consider other comparables), coercive feedback (e.g., client pressures appraiser into increasing the estimate by threatening to send less work and / or remove appraiser from list of acceptable suppliers), and positive reinforcement feedback (e.g., client does not discuss value judgment, is grateful, and sends more work). The authors find that the form of feedback provided has an impact on whether appraisers perceive their role as price validators (environmental perception and coercive feedback) or as providers of an objective market value (positive reinforcement feedback). The type of client feedback thus affects whether appraisers behave normatively (i.e., objectively and unbiased). Levy and Schuck (1999, 2004) discuss additional methods of client influence on appraisals, for example, exerting expert, information, reward, coercive, and procedural power.

A number of researchers have found that client feedback introduces an upward bias into appraisal judgments. Cho and Megbolugbe (1996) find evidence of this upward appraisal bias: 80% of all residential valuations reviewed in their study are equal to or higher than the transaction price. They argue that this bias is likely to result from a strong interest of all parties involved in mortgage lending to arrive at higher value estimates. Hansz (2004) and Diaz and Hansz (2010) investigate the impact of client feedback on commercial and residential appraisal judgments, respectively. Using the pending mortgage amount as a form of client feedback, Hansz (2004) finds that commercial appraisers with this knowledge make significantly higher value judgments than appraisers without this information. Diaz and Hansz (2010) show that feedback from individual sellers also influences appraisers. Using residential appraisers and a different methodology than Hansz (2004), the authors find that client feedback leads to an upward bias in residential appraisal judgments. Kinnard, Lenk, and Worzala (1997) find that client size, but not the size of the requested adjustment, influences the behavior of an appraiser. The more the appraiser’s business depends on a particular client, the more likely the appraiser is to revise value judgments to meet adjustment requests by this client. The findings of Zhu and Pace (2012) are in line with previous studies. The authors find that residential appraisers employed by clients (e.g., lenders), as opposed to courts, provide value judgments favorable to their clients. In conclusion, although commercial and residential appraisers differ, both types of appraisers are susceptible to client feedback, resulting in an upward bias in appraisal judgments.

Hansz and Diaz (2001) focus on market feedback and find that transaction price feedback also upwardly biases valuation judgments. Transaction price feedback represents simple outcome feedback. This type of feedback can be defined as “information about the realization of a previously predicted event” (Önkal and Muradoglu, 1995). Simple outcome feedback (transaction price feedback) provides information about the correctness of a judgment (Leung and Trotman, 2008). In their experiment, Hansz and Diaz (2001) asked expert commercial appraisers to value a plot of vacant industrial land. Once appraisers provided a value estimate, they received feedback that their value estimate was either too low or too high compared to the realized sales price. A control group received no feedback. Appraisers were then asked to value a second unrelated vacant plot.

Experimental subjects who received the feedback that their first value estimate was too low, compared to the sales price, made significantly higher value judgments on the subsequent unrelated property. The “too high” feedback had no effect on value judgments. The authors explain their findings with appraiser-client dynamics. Mortgage lenders represented the most important client group [56.6%; Hansz (1999)] of experimental subjects used in Hansz and Diaz (2001). When the study was conducted in 1999, appraisers were accustomed to clients (lenders) with a strong interest in value estimates at the upper end of the justifiable range. These clients were likely to request an upward adjustment if a value estimate was below the pending mortgage amount, signaling to appraisers that their estimates were too low. Consequently, appraisers in the study may have subconsciously responded to the “too low” feedback provided in the experiment.

This biasing effect of client feedback is in line with the findings of other studies investigating appraisal bias in the mortgage lending process (Cho and Megbolugbe, 1997; Hansz, 2004).

The introduction of AMCs as intermediaries between lenders and residential appraisers in line with the Dodd-Frank Act eliminates direct client feedback. As client feedback is the main explanation for the effect of transaction price feedback on appraisal judgments (Hansz and Diaz, 2001), we expect the absence of direct client feedback to eliminate this effect. Thus, residential appraisers are expected to be unbiased and objective. We explore the following hypotheses.

Hypothesis 1: Compared to a control group receiving no feedback, residential expert appraisers receiving feedback that their previous value estimates were “too low” with regard to the realized transaction price do not make higher subsequent value judgments on an unrelated property.

Hypothesis 2: Compared to a control group receiving no feedback, residential expert appraisers receiving feedback that their previous value estimates were “too high” with regard to the realized transaction price do not make lower subsequent value judgments on an unrelated property.

Data and Methodology

We test our hypotheses in a controlled (laboratory) experiment using a pre-posttest experimental design. The advantage of a controlled experiment is the ability to isolate a cause-effect relationship, eliminate confounding effects, and ensure high internal validity, which is of importance to this study. The focus on a single cause-effect relationship tends to work counter to external validity, an acknowledged trade-off in controlled experiments.

However, we understand our study as the starting point for further research focusing on high external validity, for example, by means of a field experiment (e.g., investigating client pressure and valuation judgments in a real-world setting).

The experimental design shown in Exhibit 1 follows Hansz and Diaz (2001). The experiment has one factor or independent variable (transaction price feedback) fixed at three levels: “too high” feedback, “too low” feedback, and no feedback. The too high and too low feedback groups are the treatment groups, while the no feedback group is a pre-posttest control group. A posttest-only validity control group is additionally introduced to assess whether a testing bias is present. Subjects may be affected merely by participating in a repeated measures / pre-posttest experiment, which represents a threat to internal and construct validity.

Exhibit 1. Experimental Design

Group                                         Attribute
Transaction price feedback group: too low     R   O   X1   O
Transaction price feedback group: too high    R   O   X2   O
No feedback control group                     R   O        O
Validity control group                        R            O

Note: This table presents our experimental design. “R” indicates random assignment, “O” an observation or measure, and “X” a treatment. Subjects are randomly assigned to each of the four experimental groups (R). Each group has 10 subjects. The “too low,” “too high,” and no feedback groups receive a first valuation case of a vacant residential property. The value estimate of each appraiser is recorded and represents the first observation (O). Subjects in the treatment groups, too low and too high feedback, then receive their treatment (X). The treatment is a seller’s broker’s note with the transaction price for the property they valued in the first valuation case. The realized transaction price is either 15% below the individual estimate (too high feedback group) or 15% above the individual estimate (too low feedback group). Subjects in all groups are then provided with a second valuation case of an unrelated vacant residential property. Their value estimates are again recorded and represent the second observation (O).

Residential expert appraisers, defined as Oregon State certified residential appraisers active in the Portland MSA, are used as experimental subjects. All expert appraisers are independent residential appraisers and not in-house appraisers for mortgage lending institutions. Following Hansz and Diaz (2001), we randomly assign 10 subjects to each experimental group (too high, too low, posttest only, and pre-posttest control group), resulting in a total sample size of 40. This sample size represents a trade-off between statistical power and experimental feasibility considerations. Experimental subjects are randomly selected from a list of active Oregon State certified residential appraisers obtained from the Oregon Appraiser Certification and Licensure Board (OACLB). We contacted each randomly selected appraiser by phone or, if the phone number was missing, by email. If an appraiser declined participation in the experiment, another appraiser was randomly selected from the list and contacted. This procedure was repeated until the full sample of 40 appraisers was achieved. While experimental and survey-based studies commonly face the threat of selection bias due to the availability of subjects and the uncertainty of who responds to the invitation to participate, we are confident that our sample is representative of the underlying population as its characteristics are in line with those of samples used in previous experimental appraisal studies (as discussed below). Before the experiment, participants were informed that they would conduct two simplified valuations of vacant residential land as part of a study investigating residential appraisal behavior. Participants, however, did not receive any information about the experimental manipulation or the specific purpose of the experiment in order to avoid a threat to construct validity from hypothesis guessing.
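The random assignment of 40 recruited appraisers to four groups of 10 can be illustrated with a minimal sketch. The subject IDs, group labels, function name, and seed below are hypothetical; the paper does not describe the randomization mechanism it used.

```python
import random

# Hypothetical labels for the four experimental groups in Exhibit 1.
GROUPS = ["too_low", "too_high", "no_feedback", "validity_control"]


def assign_groups(subject_ids, groups, per_group, seed=None):
    """Randomly split subjects into equally sized groups.

    Illustrative only: shuffles the subject list once and slices it,
    so every subject lands in exactly one group.
    """
    subject_ids = list(subject_ids)
    if len(subject_ids) != len(groups) * per_group:
        raise ValueError("number of subjects must equal groups x per_group")
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    rng.shuffle(subject_ids)
    return {
        group: subject_ids[i * per_group:(i + 1) * per_group]
        for i, group in enumerate(groups)
    }


# 40 hypothetical appraiser IDs, 10 per group as in the study design.
assignment = assign_groups(range(1, 41), GROUPS, per_group=10, seed=2011)
for group, members in assignment.items():
    print(group, sorted(members))
```

Shuffling once and slicing guarantees equal group sizes, which simple per-subject coin flips would not.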

The experiment was conducted as follows: First, appraisers in the treatment and control groups were given the first valuation case, which required them to value a vacant lot of residential land in Roswell, Georgia. After reviewing the information provided in the appraisal case and employing the sales comparison approach, appraisers were required to write their final estimate on a sheet of paper.

As a next step, appraisers in the treatment groups were given a seller’s broker’s note (treatment / manipulation), which is the transaction price feedback containing implicit client feedback in line with Hansz and Diaz (2001). This note was prepared for each appraiser individually during the experiment (i.e., while the appraiser was writing down his / her final estimate for case 1). The experimenter made sure that the appraisers were distracted with finalizing case 1 and did not notice the preparation of the note in order to avoid a failed manipulation. The note stated the fictitious sales price of the subject property that the appraiser valued previously (case 1). For subjects in the too low feedback group, the transaction price was manipulated by adding 15% to the estimate of each individual appraiser for valuation case 1. For the too high feedback group, the transaction price was manipulated by deducting 15% from the case 1 value estimate given by each subject. The 15% difference between the value estimate for case 1 and the sales price is in line with Hansz and Diaz (2001). No transaction price feedback was provided to the control group.
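As a worked illustration of the manipulation, the fictitious transaction price on the note can be derived from each appraiser's own case 1 estimate. The function name, group labels, and rounding to whole dollars are assumptions made for this sketch; the paper states only the 15% adjustment.

```python
def feedback_price(case1_estimate, group, deviation=0.15):
    """Return the fictitious sales price shown on the broker's note.

    'too_low' feedback means the appraiser's estimate was below the sale,
    so the price is the estimate plus 15%; 'too_high' is the estimate
    minus 15%. The control group receives no note (None).
    Rounding to whole dollars is an assumption of this sketch.
    """
    if group == "too_low":
        return round(case1_estimate * (1 + deviation))
    if group == "too_high":
        return round(case1_estimate * (1 - deviation))
    if group == "no_feedback":
        return None
    raise ValueError(f"unknown group: {group}")


# Hypothetical case 1 estimate of $160,000.
for group in ("too_low", "too_high", "no_feedback"):
    print(group, feedback_price(160_000, group))
```

Because the price is tied to each individual estimate, every treated subject sees the same relative deviation regardless of how conservative the first valuation was.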

In a final step, appraisers were given a second, unrelated hypothetical valuation case set in Newnan, Georgia. The second valuation case was given to all four groups: too high, too low, control group, and posttest-only control group. Appraisers were again asked to write down their value estimate for this case. Additionally, appraisers were asked to complete an exit questionnaire with demographic and professional questions, as well as manipulation checks. The experiments were conducted from October to December 2011 and subjects required on average 40 minutes to complete the cases.

The two valuation cases used in this study represented two hypothetical simplified valuations of vacant residential land, structured following Hansz and Diaz (2001). They required appraisers to use the sales comparison approach to arrive at a value estimate.

Each case included the identification of the subject property, purpose of the appraisal, neighborhood data, property data, and five comparable sales. The property in valuation case 1 was a 0.42-acre vacant residential lot located in Roswell, Georgia (Fulton County), while the property in valuation case 2 was a 0.2-acre vacant residential lot located in Newnan, Georgia (Coweta County). Placing residential appraisers from Portland, Oregon into a geographically unfamiliar environment (Georgia) is in line with previous behavioral studies (e.g., Diaz and Hansz, 1997; Tidwell and Gallimore, 2014). Geographical unfamiliarity increases the complexity and uncertainty of an appraisal assignment, which is likely to increase the probability that subjects respond to the treatment. Valuation cases in geographically unfamiliar areas additionally eliminate potential confounding effects resulting from the varying familiarity of experimental subjects within geographically familiar areas (e.g., different submarkets in Portland). Comparable sales (comps) for both valuation cases were similar to the respective subject properties for cases 1 and 2 in terms of features such as zoning, financing, location, and available utilities. To prevent appraisers from simply averaging comp sales prices to get an estimate, sales prices varied widely and required appraisers to thoroughly evaluate the features of comps against features of the subject property. Increasing the discrepancy of comps transaction prices did not reduce their credibility as, according to experimental subjects, the vacant residential land market at the time of the experiments showed similar characteristics (i.e., sales prices were “all over the place”). Analogously to Hansz (1999), fictitious transaction prices and sales dates were used to, firstly, eliminate any price trend in consecutive sales and, secondly, provide appraisers with a range of superior and inferior properties compared to the subject properties while controlling for potentially confounding effects. The valuation cases were fine-tuned in a pilot study with six expert appraisers.

We employ the independent samples t-test to analyze our experimental data. The research hypotheses are tested based on the difference of an individual appraiser’s estimate for valuations 2 and 1 (DIFFVAL). As the difference is negative, it is multiplied by -1.

Compared to using the value estimates for the second valuation case only, this approach is considered more appropriate as it makes appraisers more comparable. Some appraisers may be more conservative (less conservative) than others and their estimates for the second valuation case may thus be lower (higher). As these lower (higher) estimates are not the result of experimental manipulation, but personal preferences, we decided to eliminate this potentially confounding effect. DIFFVAL allows the analysis of the adjustments made from valuation case 1 to case 2 without these fundamental tendencies of individual appraisers. Assumption testing indicates that the assumptions of the parametric t-test, normality and equality of variances, are not violated by our experimental dataset (DIFFVAL). However, due to small sample sizes, a major threat to statistical conclusion validity in experimental research is low statistical power. We conduct a post-hoc power analysis using GPOWER to investigate whether low statistical power is an explanation for insignificant results [for more information about GPOWER, see Erdfelder, Faul, and Buchner (1996)].
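The DIFFVAL construction and the independent samples t-test can be sketched as follows. This is a minimal illustration under stated assumptions: the estimates are invented, the function names are hypothetical, and the equal-variance (pooled) form of the t-test is used because the text reports that equality of variances is not violated.

```python
import math
from statistics import mean, variance


def diffval(case1_estimates, case2_estimates):
    """DIFFVAL per appraiser: (V2 - V1) multiplied by -1, so it is positive
    when the case 2 estimate is below the case 1 estimate."""
    return [-(v2 - v1) for v1, v2 in zip(case1_estimates, case2_estimates)]


def pooled_t_test(sample_a, sample_b):
    """Equal-variance independent samples t statistic and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(
        pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Invented estimates (in dollars) for a few appraisers per group.
too_low_v1, too_low_v2 = [165_000, 150_000, 172_000], [78_000, 70_000, 80_000]
control_v1, control_v2 = [155_000, 148_000, 160_000], [69_000, 66_000, 72_000]

t_stat, df = pooled_t_test(diffval(too_low_v1, too_low_v2),
                           diffval(control_v1, control_v2))
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting statistic would be compared against the critical value of the t distribution with `df` degrees of freedom.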

As a robustness check, we use bootstrapping and derive bootstrapped confidence intervals. While the t-test and its non-parametric counterpart, the Mann-Whitney U test, have been traditionally used to analyze experimental data, bootstrapping is an alternative technique in experimental real estate research. Bootstrapping has been suggested as an appropriate technique to analyze experimental data in a number of fields (e.g., zoology, drug testing, immunoassay, production control, political science, real estate valuation), in which only small sample sizes are available, feasible, or ethically justifiable (Kuhle and Moorehead, 1990; Mooney, 1996; Jones and Rocke, 1999; Prodan and Campean, 2005; Ivanescu, Bertrand, Fransoo, and Kleijnen, 2006). Bootstrapping has two major advantages for small sample sizes. Firstly, it can be used for calculating inferential statistics even if distributional characteristics are not known or the assumptions of parametric tests are violated (e.g., non-normality, highly skewed) (Mooney, 1996). Secondly, bootstrapping provides a solution to the problem of low statistical power in experimental research, particularly for small and medium effect sizes.

The fundamental difference between bootstrapping and traditional statistical inferential tests, such as the parametric t-test or ANOVA, is that the latter are based on probability theory. Instead of making assumptions about the underlying population and sampling distribution (e.g., normality, central limit theorem), bootstrapping creates the sampling distribution from observed sample data (Hesterberg et al., 2002). The basic idea of bootstrapping is to create resamples (e.g., 1,000, 10,000, 100,000) with or without replacement from an observed data set (e.g., experimental sample data). The resamples allow the development of a sampling distribution for the statistics of interest (e.g., mean, median, difference between means). The sampling distribution of the statistic of interest then allows inferences about the respective population parameter. If the sample is a good representation of the actual population, bootstrapping will produce a good approximation of the sampling distribution of the population parameter (Cugnet, 1997).

In our analysis, we use the percentile bootstrap interval approach, in which a confidence interval is calculated around the statistic of interest. For an alpha of 5%, the confidence interval is between the 2.5th and 97.5th percentile of the sampling distribution. The percentile interval method has been found to be approximately correct for small samples (Kenett, Rahav, and Steinberg, 2006). Wood (2005) argues that the percentile bootstrap interval approach is most appropriate if the following conditions are satisfied: First, random sampling is used, resulting in an initial random sample. Second, the guessed population based on the resamples is “similar” to the real population. Third, the estimate is unbiased (i.e., the statistic derived from the sample data corresponds to the statistic derived from the sampling distribution). Fourth, the distribution of the resample statistic should be “reasonably” symmetric. Fifth, the error distribution should be independent of the true parameter value [for a more detailed discussion, see Wood (2005)].
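A percentile bootstrap interval for the difference between two group means can be sketched as follows. The data, function names, seed, and the linear-interpolation percentile rule are assumptions made for this illustration. The 10,000 resamples with replacement and the control-minus-treatment difference follow the procedure the paper describes.

```python
import math
import random


def percentile(sorted_values, pct):
    """Percentile of an already sorted list, using linear interpolation
    between neighbouring order statistics (one of several common rules)."""
    k = (len(sorted_values) - 1) * pct / 100
    lower, upper = math.floor(k), math.ceil(k)
    return sorted_values[lower] + (sorted_values[upper]
                                   - sorted_values[lower]) * (k - lower)


def bootstrap_diff_ci(treatment, control, n_resamples=10_000,
                      alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(control) - mean(treatment)."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        t_res = rng.choices(treatment, k=len(treatment))
        c_res = rng.choices(control, k=len(control))
        diffs.append(sum(c_res) / len(c_res) - sum(t_res) / len(t_res))
    diffs.sort()
    return (percentile(diffs, 100 * alpha / 2),
            percentile(diffs, 100 * (1 - alpha / 2)))


# Invented DIFFVAL values (in dollars) for two groups of 10 appraisers.
too_low = [71_000, 75_000, 80_000, 84_000, 86_000,
           87_000, 90_000, 98_000, 110_000, 140_000]
control = [55_000, 70_000, 78_000, 84_000, 86_000,
           87_000, 90_000, 92_000, 96_000, 120_000]

ci_low, ci_high = bootstrap_diff_ci(too_low, control)
print(f"95% percentile interval: [{ci_low:,.0f}, {ci_high:,.0f}]")
```

If the resulting interval contains zero, the difference between the group means is not significant at the chosen alpha.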

Our experimental data (DIFFVAL) satisfies the prerequisites of the percentile interval method. The experimental sample is the result of random selection and assignment. Thus, it can be considered representative of the actual population. The estimate is assumed to be unbiased. The mean of the DIFFVAL sampling distribution (N = 10,000) is equal to the DIFFVAL mean of the original sample (N = 10) for each experimental group. This suggests no bias is present (Hesterberg et al., 2002). The error distribution is assumed to be independent and not affected by serial correlation. The sampling distribution for the difference between the mean DIFFVAL for the too low feedback group and the no feedback control group is slightly skewed (S = 0.15) and has a slight kurtosis of 0.034.

However, with regard to the requirements of the percentile interval method (Wood, 2005), the distribution can be considered “reasonably” symmetric. The sampling distribution of the mean difference of DIFFVAL for the too high feedback and no feedback control group is slightly skewed (S = 0.006) and has a kurtosis of 0.118. It can also be considered “reasonably” symmetric.

We calculate the bootstrap confidence interval for the difference between our experimental groups as follows: We take 10,000 resamples from the DIFFVAL data for each experimental group (sampling with replacement). The means of the resamples for each experimental group are recorded. For each treatment group, the resample mean is subtracted from the respective resample mean for the control group (e.g., resample 1 mean for the too low feedback group is deducted from the resample 1 mean for the no feedback control group). The 10,000 mean differences are recorded. In a final step, the 2.5th and 97.5th percentile for the sampling distribution of the mean difference in DIFFVAL between groups is determined.

Results

We present the sample profile, descriptive statistics, and results of our statistical analysis using the independent samples t-test and bootstrapping as robustness check in this section. We also include the results of the effect size and post-hoc power analysis.

Sample Profile and Descriptive Statistics

Exhibit 2 provides an overview of the sample profile. The majority of the participants in this study (77.5%) were male. On average, subjects were 51 years old and had 20 years of experience in residential real estate appraisal. Most participants (65%) are highly educated with a bachelor and / or graduate degree. On average, 97% of subjects’ work comes from residential appraisal and 45% of all participants hold additional appraisal designations, such as senior residential appraiser (SRA) or independent fee appraiser (IFA). Our sample profile is similar to the profiles of other studies such as Wolverton and Gallimore (1999; commercial and residential appraisers), Hansz (2004; commercial appraisers), and Hansz and Diaz (2001; commercial appraisers) with regard to the age, work experience, education, additional qualifications, and gender of subjects. Thus, our sample is reasonably representative of appraisers in the U.S.

Exhibit 2. Sample Profile

Characteristic                                                        Summary
Gender
  Male                                                                77.5%
  Female                                                              22.5%
Age (in years)                                                        50.7
Experience (in years)                                                 20
Education
  High school                                                         2.5%
  Some college                                                        32.5%
  Bachelor degree                                                     45%
  Graduate degree                                                     20%
Share of residential valuation                                        97%
Share of appraisers with additional certifications / designations     45%

Note: This table presents the sample profile. Age, experience, share of residential valuation, and share of appraisers with additional certifications / designations represent means based on a sample of 40 residential expert appraisers.

Exhibit 3 presents descriptive statistics for the first and second valuation case separated by the experimental group. The group means for valuation case 1 are not statistically different at the 5% level, which indicates no validity-threatening pre-test differences between experimental groups. The means of the pre-posttest no feedback control group and the validity control group for the second valuation case are not significantly different at the 5% level. Thus, the testing bias does not threaten the internal and external validity of our investigation. Exhibit 4 presents the descriptive statistics for DIFFVAL for each of the pre-posttest experimental groups.

Independent Samples t-test

In Hypothesis 1, we posit that the value estimates of appraisers receiving the too low feedback will not differ from the estimates of the control group (i.e., will not be higher).

This translates into the testable hypotheses as shown in equations 1 and 2.

Hypothesis H_O: DIFFVAL_{too low} ≤ DIFFVAL_{NF}.    (1)

Hypothesis H_A: DIFFVAL_{too low} > DIFFVAL_{NF}.    (2)

DIFFVAL is the mean difference between value estimates for case 1 and 2; too low is the too low feedback group, and NF is the no feedback control group. With regard to the research hypothesis, we expect to fail to reject the null hypothesis. The mean for the too low feedback group is $92,141, while the mean for the control group is $85,783. As shown in Exhibit 5, these means are not statistically different at the 5% level. These findings are in line with our expectation.

Exhibit 3. Descriptive Statistics

Panel A: Valuation Case 1

             Too Low Feedback   Too High Feedback   No Feedback
Mean         $167,210           $160,334            $154,792
Median       $164,064           $157,500            $152,341
Std. Dev.    $22,169            $16,448             $15,184
Min.         $134,776           $130,200            $132,510
Max.         $208,250           $192,000            $180,000
Range        $73,474            $61,800             $47,490

Panel B: Valuation Case 2

             Too Low Feedback   Too High Feedback   No Feedback   Validity Control
Mean         $75,069            $72,034             $69,009       $75,393
Median       $77,500            $76,000             $66,481       $77,500
Std. Dev.    $8,075             $12,775             $7,629        $7,193
Min.         $58,800            $45,000             $60,000       $63,469
Max.         $84,624            $85,000             $82,000       $82,140
Range        $25,824            $40,000             $22,000       $18,671

Note: The table presents descriptive statistics for the first and second valuation case. These descriptive statistics are based on a sample size of 30 for the first valuation case and a sample size of 40 for the second valuation case.

Exhibit 4. Descriptive Statistics (DIFFVAL)

             Too Low Feedback   Too High Feedback   No Feedback
Mean         $92,141            $88,300             $85,783
Median       $86,600            $86,400             $86,750
Std. Dev.    $23,259            $15,521             $19,114
Min.         $70,713            $69,118             $55,322
Max.         $142,447           $110,000            $120,000
Range        $71,734            $40,882             $64,678

Note: The statistical analysis is based on the difference between value estimates for the first and second valuation case. DIFFVAL is calculated by subtracting the first case value estimate of each appraiser from the second case value estimate (V2-V1). As the difference is negative, DIFFVAL is multiplied by -1. DIFFVAL is based on a sample size of 30.

Exhibit 5. Results of Hypothesis Testing and Power Analysis

               Residential Appraisers (2011)     Commercial Appraisers (2001)^a
               T-stat.   Effect Size   Power     T-stat.   Effect Size   Power
Hypothesis 1   0.668     0.30          0.10      2.067**   0.92          0.63
Hypothesis 2   0.323     0.14          0.09      0.782     0.35          0.19

Note: The table presents the t-statistics for the parametric t-test. The analysis is based on DIFFVAL and a sample size of 30. It also presents the results of an effect size and post-hoc power analysis. Sample effect sizes are calculated as Cohen’s d. The post-hoc power analysis was conducted using GPOWER (α = 0.05, a sample size of 10 for each group, and the above effect sizes).
a Based on Hansz and Diaz (2001).
** Significant at the 5% level.

We posit in Hypothesis 2 that appraisers receiving the too high feedback will make no significantly lower subsequent value estimates than the control group. The testable hypotheses are shown in equations (3) and (4).

\[ H_O: \text{DIFFVAL}_{\text{too high}} \ge \text{DIFFVAL}_{\text{NF}} \tag{3} \]

\[ H_A: \text{DIFFVAL}_{\text{too high}} < \text{DIFFVAL}_{\text{NF}} \tag{4} \]

DIFFVAL is the mean difference between value estimates for both valuation cases, too high is the too high feedback group, and NF is the no feedback control group. The mean for the too high feedback group is $88,300 and the mean for the control group is $85,783.

These group means are not statistically different from each other at the 5% level (Exhibit 5). The null hypothesis cannot be rejected, which is in line with our expectation that the absence of client feedback eliminates the relationship of transaction price feedback and residential appraisal judgment.
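As a rough cross-check, both t-statistics can be recomputed from the published summary statistics alone. The sketch below assumes a pooled-variance (equal variances) independent samples t-test with 10 subjects per group. The group size comes from the note to Exhibit 5, but the pooled-variance form is an assumption, since the text does not state which variant was used. The function name is illustrative only.

```python
import math


def pooled_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, from summary statistics only.

    Assumes equal population variances (pooled-variance form).
    Returns the t-statistic and its degrees of freedom.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


N_PER_GROUP = 10  # per-group sample size stated in the note to Exhibit 5

# DIFFVAL (mean, standard deviation) per group, as reported in Exhibit 4
too_low = (92_141, 23_259)
too_high = (88_300, 15_521)
no_feedback = (85_783, 19_114)

t_h1, df = pooled_t_from_summary(*too_low, N_PER_GROUP, *no_feedback, N_PER_GROUP)
t_h2, _ = pooled_t_from_summary(*too_high, N_PER_GROUP, *no_feedback, N_PER_GROUP)

# A standard t table gives about 1.734 as the one-sided 5% critical value
# for 18 degrees of freedom; both statistics fall well below it.
print(f"Hypothesis 1: t = {t_h1:.3f} (df = {df})")
print(f"Hypothesis 2: t = {t_h2:.3f} (df = {df})")
```

Under these assumptions the recomputed values round to the 0.668 and 0.323 shown in Exhibit 5.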

Post-hoc Power and Effect Size Analysis

While the results are in line with our expectations, we have to exclude alternative explanations for them, particularly the non-reception of treatment by subjects and low statistical power. While administering the treatment, the experimenter made sure that each subject read through the seller’s broker’s note before proceeding to the second valuation case. Thus, the argument that subjects have not ‘‘received’’ the treatment (manipulation) can be rejected. Most subjects were surprised about the transaction price as it deviated from their estimates for the first valuation case and no additional information about the particular transaction was given. However, the experimental manipulation required the deviation of the value estimate and realized transaction price for the first valuation case. The increased uncertainty was expected to increase the likelihood of subjects responding to the treatment. Additionally, it reflects the nature of real estate markets, which are characterized by limited data, segmentation, and proprietary information. Appraisers in their professional practice do not have complete and unambiguous information about transactions and residential real estate markets. Thus, we assume subjects considered the feedback to be plausible and trustworthy in the experimental context.

Low statistical power is another plausible explanation for our insignificant results. Effect sizes and power for each of the two hypotheses are shown in Exhibit 5. These effect sizes are descriptive statistics based on a sample and on their own allow no inferences.

Exhibit 6. Descriptive Statistics of Sampling Distributions for the Mean Difference

                    Too Low Feedback      Too High Feedback
                    and Control Group     and Control Group
Mean                $6,386                $2,546
Median              $6,208                $2,644
Std. Dev. (Error)   $8,906                $7,350
Min.                -$26,222              -$24,667
Max.                $43,790               $31,228
Skewness            0.15                  0.006
Kurtosis            0.034                 0.118
N                   10,000                10,000

Note: The table presents the descriptive statistics for the sampling distributions of the mean differences between the control group and the two treatment groups. For each of the three experimental groups, 10,000 bootstrap resamples are taken from the original experimental sample (sampling with replacement; N = 10 per group). The means of each resample are recorded. In a next step, a resample mean of each treatment group is subtracted from the respective resample mean of the control group yielding 10,000 mean differences. This represents the sampling distribution for the mean differences between each treatment group and the control group. In a final step, a percentile confidence interval is calculated (2.5th and 97.5th percentile) and reported in the text.

Neither the t-statistic for the too low hypothesis nor the t-statistic for the too high hypothesis exceeds the respective critical t-values. Thus, no conclusion about effect sizes in the underlying population can be made. However, as statistical power is low, it cannot be determined whether the effect does indeed not exist in the population or simply cannot be detected. Increasing our experimental sample size to increase power, however, is not feasible. Bootstrapping as a robustness check mitigates this shortcoming of the independent samples t-test and is able to provide additional results, even in the case of a small experimental sample and small effect size.
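To give a sense of scale for the sample size problem, the sketch below estimates how many subjects per group a one-sided two-sample comparison would need to detect the effect sizes in Exhibit 5. It uses a normal approximation, not the noncentral t calculation that GPOWER performs, so its power figures will not match Exhibit 5 exactly. The 80% power target is a conventional choice assumed here, not a value taken from the study, and both function names are illustrative.

```python
import math
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate one-sided power for effect size d (normal approximation)."""
    shift = d * math.sqrt(n_per_group / 2)
    return 1 - _Z.cdf(_Z.inv_cdf(1 - alpha) - shift)


def approx_n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size needed to reach the target power."""
    z_sum = _Z.inv_cdf(1 - alpha) + _Z.inv_cdf(power)
    return math.ceil(2 * (z_sum / d) ** 2)


# Effect sizes (Cohen's d) for the residential sample, from Exhibit 5
for label, d in (("Hypothesis 1", 0.30), ("Hypothesis 2", 0.14)):
    n_needed = approx_n_per_group(d)
    print(f"{label}: d = {d:.2f}, about {n_needed} subjects per group "
          f"for 80% power (power at that n: {approx_power(d, n_needed):.2f})")
```

With only 10 expert appraisers per group available, requirements of this order illustrate why enlarging the experimental sample was not a practical route.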

However, while low statistical power does not allow for any conclusions about whether the effect identified in our sample exists in the underlying population, effect size analysis nevertheless provides valuable information for our investigation. If the transaction price feedback-induced appraisal bias persisted in the absence of direct client feedback, we would expect an effect size similar to the one found in Hansz and Diaz (2001). In Exhibit 5, we compare the effect sizes of this study and the study by Hansz and Diaz (2001).

Categories of effect sizes are relative and depend on discipline, operationalization, and context; however, the effect sizes (Cohen’s d) for our too high and too low hypotheses can be considered small, while the effect size for the too low hypothesis in Hansz and Diaz (2001) can be considered large. The small effect size for both our hypotheses indicates that the absence of client feedback eliminates the effect of transaction price feedback on residential appraisal judgments.
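For two independent groups and a pooled standard deviation, Cohen’s d can be recovered from the t-statistic as d = t * sqrt(1/n1 + 1/n2). The sketch below applies this identity to the t-statistics in Exhibit 5, assuming 10 subjects per group in both studies, which is the setup stated for the power analysis in the exhibit note. The cutoffs of 0.5 and 0.8 are Cohen’s commonly cited conventions and are an assumption here; as noted above, such categories are relative.

```python
import math


def cohens_d_from_t(t_stat, n_a, n_b):
    """Cohen's d implied by a pooled-variance t-statistic for two groups."""
    return t_stat * math.sqrt(1 / n_a + 1 / n_b)


def conventional_label(d):
    """Rough label using Cohen's conventional cutoffs (assumed, not from the study)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    return "small"


N_PER_GROUP = 10  # assumed for both studies, per the note to Exhibit 5

# t-statistics as reported in Exhibit 5
t_stats = {
    "Residential, Hypothesis 1": 0.668,
    "Residential, Hypothesis 2": 0.323,
    "Commercial, Hypothesis 1": 2.067,
    "Commercial, Hypothesis 2": 0.782,
}

effect_sizes = {
    name: cohens_d_from_t(t, N_PER_GROUP, N_PER_GROUP) for name, t in t_stats.items()
}

for name, d in effect_sizes.items():
    print(f"{name}: d = {d:.2f} ({conventional_label(d)})")
```

Under these assumptions the computed values round to the effect sizes listed in Exhibit 5 (0.30, 0.14, 0.92, and 0.35).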

Robustness Check: Bootstrapping

Exhibit 6 provides the descriptive statistics of the sampling distribution for the mean difference in DIFFVAL between the too low feedback and no feedback group. The mean of the sampling distribution is $6,386 and the standard error is $8,906. The 2.5th percentile of this distribution is -$10,284.61 and the 97.5th percentile is $24,171.90. Thus, the 95% percentile confidence interval for the mean difference runs from -$10,284.61 to $24,171.90. The bootstrap interval includes 0 and therefore, the null hypothesis of no difference between the sample means for DIFFVAL cannot be rejected. This result is in line with the expectation that the too low feedback has no impact on residential appraisers.
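The resampling procedure described in the note to Exhibit 6 can be sketched as follows. This is a minimal illustration, not the authors’ code: the function name is hypothetical, the two data lists are made-up placeholder values (the individual DIFFVAL observations are not published), and the difference is taken as treatment minus control so that its sign matches the positive means reported in Exhibit 6.

```python
import random
import statistics


def bootstrap_mean_diff_ci(treatment, control, n_resamples=10_000, seed=0):
    """Percentile bootstrap interval for the difference in group means.

    Each group is resampled with replacement at its original size, the
    resample means are differenced (treatment minus control), and the
    2.5th and 97.5th percentiles of those differences form the interval.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        t_mean = statistics.fmean(rng.choices(treatment, k=len(treatment)))
        c_mean = statistics.fmean(rng.choices(control, k=len(control)))
        diffs.append(t_mean - c_mean)
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper, diffs


# Hypothetical DIFFVAL values for illustration only (10 subjects per group)
hypothetical_treatment = [70_000, 75_000, 80_000, 85_000, 88_000,
                          92_000, 95_000, 100_000, 110_000, 125_000]
hypothetical_control = [60_000, 70_000, 75_000, 80_000, 85_000,
                        88_000, 90_000, 95_000, 100_000, 115_000]

lower, upper, diffs = bootstrap_mean_diff_ci(hypothetical_treatment, hypothetical_control)
print(f"Bootstrap mean of differences: {statistics.fmean(diffs):,.0f}")
print(f"Bootstrap standard error:      {statistics.stdev(diffs):,.0f}")
print(f"95% percentile interval:       {lower:,.0f} to {upper:,.0f}")
print("Interval includes 0:", lower <= 0 <= upper)
```

The decision rule is the one used in the text: if the percentile interval contains 0, the null hypothesis of no difference between group means is not rejected.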

As shown in Exhibit 6, the sampling distribution mean of the difference in DIFFVAL between the too high feedback and no feedback group is $2,546 and the standard error is $7,350. The respective (2.5th; 97.5th) percentile interval is -$11,807.30 to $16,756.21.

As the confidence interval includes 0, the null hypothesis of no mean difference between the too high feedback and no feedback group cannot be rejected. The findings of using bootstrap confidence intervals are in line with those of the independent samples t-test.

Discussion

Our statistical analysis using bootstrapping eliminated low statistical power as an explanation for the insignificant results of this investigation. Thus, our findings indicate that the HVCC, Dodd-Frank Act, and AMCs have altered the relationship of transaction price feedback and residential appraisal judgment by eliminating direct client feedback.

The current residential appraisal task environment is very different from the pre-2007 environment, in which lenders were more likely to influence appraisers to deliver a certain value estimate (e.g., similar to or higher than the pending mortgage amount). As Hansz and Diaz (2001) discuss, market feedback implicitly includes client feedback, and commercial appraisers in their study subconsciously responded to the too low feedback as they received this type of feedback from their clients (e.g., lenders) on a regular basis in their professional life. While residential mortgage lenders at the time this study was conducted may have been interested in appraisals at the lower end of the justifiable range to reduce their risk exposure, residential appraisers did not receive any too high feedback from lenders as they had no direct contact with them and thus were not likely to subconsciously respond to the too high treatment in our experiment.

In this study, AMCs represent the most important client group of experimental subjects (62%; Exhibit 7) as opposed to mortgage lenders, which were the most important client group (56.6%) in Hansz and Diaz (2001) based on Hansz (1999). We asked our experimental subjects whether they experience pressure from AMCs to arrive at a certain value. A number of residential appraisers confirmed that, compared to the pre-HVCC / Dodd-Frank environment, AMCs do not pressure appraisers to validate a pending mortgage amount. Thus, residential appraisers agreed that AMCs do not exert direct client feedback. While AMCs put residential appraisers under pressure to reduce fees, follow certain guidelines, and have a short turnaround time, they do not require appraisers to arrive at a certain value estimate (i.e., give no too high or too low feedback). The absence of direct client feedback through the introduction of AMCs thus helps us to explain why we find no evidence of an appraisal bias (i.e., an impact of transaction price feedback on appraisal judgments).

Exhibit 7. Appraisal Client Profile

                                  Full     Too Low    Too High   No         Validity
                                  Sample   Feedback   Feedback   Feedback   Control
Mortgage lenders                  17.9%    14.6%      18.7%      14%        24.3%
Individual homebuyers / sellers   5.2%     2.7%       9.7%       4.7%       3.8%
AMC                               62.2%    71.5%      60.8%      61.3%      55.2%
Governmental agencies             7.5%     2.7%       5%         6%         16.2%
Other                             7.2%     8.5%       5.8%       14%        0.5%

Note: The above table presents the client profile of residential expert appraisers used in this study. The percentages represent means based on a sample size of 40.

Conclusion

Previous studies have investigated the impact of client and market feedback on appraisal judgments and found an upward bias. However, all these studies were conducted in an environment (residential and commercial) in which appraisers were in direct contact with their clients, particularly lenders. This study extends the findings of Hansz and Diaz (2001) and investigates the relationship of transaction price feedback and appraisal judgment in a fundamentally changed residential appraisal industry. The Dodd-Frank Act disconnects lenders and residential appraisers by introducing AMCs as intermediaries. We thus hypothesize that the absence of direct client feedback eliminates the effect of transaction price feedback on residential appraisal judgments.

Using a controlled pre-posttest experiment following Hansz and Diaz (2001) and residential expert appraisers from Portland, Oregon, we posit two hypotheses and test two alternative inferential approaches. First, the parametric independent samples t-test is used. No statistical significance is found for the mean difference between experimental groups. However, a post-hoc power analysis reveals low statistical power and small effect sizes. We use bootstrapping as a robustness check. Bootstrapping represents a solution for the analysis of experimental data suffering from low power. The advantage of the bootstrapping technique for this investigation is that low statistical power can be excluded as an explanation for the insignificant results. Our bootstrapping results, however, also show no difference between experiment group means.

In conclusion, we find evidence that transaction price feedback fails to introduce an appraisal bias into residential valuation judgments in the absence of direct client feedback. While Hansz and Diaz (2001) find an asymmetric appraisal bias, our results indicate that changes to the lender-appraiser relationship introduced by the HVCC and Dodd-Frank Act lead to the unbiased behavior of appraisers. Thus, this legislation appears to be effective.

This study represents a starting point for additional research into the impact of market feedback on appraisal judgments. While our study and that of Hansz and Diaz (2001) focus on the biasing effect of this type of feedback, future studies could investigate whether outcome feedback (e.g., transaction price feedback) or more complex feedback types actually help to improve appraisal performance and judgment. Such investigations could also include databases such as CoStar or MLS and the types of market feedback they provide. Additionally, future studies could investigate the impact of AMCs on appraisal behavior, for example with regard to the quality of appraisals or the susceptibility of appraisers working for AMCs to heuristic biases.

References

Abernethy, A.M. and H. Hollans. The Home Valuation Code of Conduct and its Potential Impacts. Appraisal Journal, 2010, 78:1, 81–93.

Chinloy, P., M. Cho, and I.F. Megbolugbe. Appraisals, Transaction Incentives, and Smoothing. Journal of Real Estate Finance and Economics, 1997, 14, 89–111.

Cho, M. and I.F. Megbolugbe. An Empirical Analysis of Property Appraisal and Mortgage Redlining. Journal of Real Estate Finance and Economics, 1996, 13, 45–55.

Cugnet, P. The Original Bootstrap Method. Available at: http://scholar.lib.vt.edu/theses/available/etd-61697-14555/unrestricted/Ch4.pdf, 1997. Accessed December 9, 2011.

Diaz, J. III. An Investigation into the Impact of Previous Expert Value Estimates on Appraisal Judgment. Journal of Real Estate Research, 1997, 13:1, 57–66.

Diaz, J. III and J.A. Hansz. How Valuers Use the Value Opinions of Others. Journal of Property Investment & Finance, 1997, 15:3, 256–60.

——. The Use of Reference Points in Valuation Judgment. Journal of Property Research, 2001, 18:2, 141–48.

——. A Taxonomic Field Investigation into Induced Bias in Residential Real Estate Appraisals. International Journal of Strategic Property Management, 2010, 14, 3–17.

Erdfelder, E., F. Faul, and A. Buchner. GPOWER: A General Power Analysis Program. Behavior Research Methods, Instruments, & Computers, 1996, 28, 1–11.

Gallimore, P. and M. Wolverton. Price-knowledge-induced Bias: A Cross-Cultural Comparison. Journal of Property Investment and Finance, 1997, 15:3, 261–73.

Hansz, J.A. The Influence of Market Feedback on the Appraisal Process. Georgia State University, Dissertation, 1999.

——. The Use of a Pending Mortgage Reference Point in Valuation Judgment. Journal of Property Investment & Finance, 2004, 22:3, 259–68.

——. Valuation Bias in Commercial Appraisal: A Transaction Price Feedback Experiment. Real Estate Economics, 2001, 29:4, 553–65.

Hesterberg, T., S. Monaghan, D.S. Moore, A. Clipson, and R. Epstein. Bootstrap Methods and Permutation Tests. Chapter 18 in The Practice of Business Statistics. New York: W.H. Freeman and Company, 2002.

Ivanescu, V.C., J.W. Bertrand, J.C. Fransoo, and J.P. Kleijnen. Bootstrapping to Solve the Limited Data Problem in Production Control: An Application in Batch Process Industries. Journal of the Operational Research Society, 2006, 57, 2–9.

Jones, G. and D.M. Rocke. Bootstrapping in Controlled Calibration Experiments. Technometrics, 1999, 41:3, 224–33.

Kenett, R.S., E. Rahav, and D.M. Steinberg. Bootstrap Analysis of Designed Experiments. Quality and Reliability Engineering International, 2006, 22, 659–67.

Kinnard, W.N., M.M. Lenk, and E.M. Worzala. Client Pressure in the Commercial Appraisal Industry: How Prevalent Is It? Journal of Property Investment & Finance, 1997, 15:3, 233–44.

Kuhle, J.L. and J.D. Moorehead. Applying the Bootstrap Technique to Real Estate Appraisal: An Empirical Analysis. Journal of Real Estate Research, 1990, 5:1, 33–40.

Leung, P.W. and K.T. Trotman. Effect of Different Types of Feedback on the Level of Auditor’s Configural Information Processing. Accounting and Finance, 2008, 48, 301–18.

Levy, D. and E. Schuck. The Influence of Clients on Valuations. Journal of Property Investment & Finance, 1999, 17:4, 380–94.

——. The Influence of Clients on Valuations: The Clients’ Perspective. Journal of Property Investment & Finance, 2004, 23:2, 182–201.

Mooney, C.Z. Bootstrap Statistical Inference: Examples and Evaluations for Political Science. American Journal of Political Science, 1996, 40:2, 570–73.

Önkal, D. and G. Muradoglu. Effects of Feedback on Probabilistic Forecasts of Stock Prices. International Journal of Forecasting, 1995, 11, 307–19.

Prodan, A. and R. Campean. Bootstrapping Methods Applied for Simulating Laboratory Works. Campus-Wide Information Systems, 2005, 22:3, 168–75.

Tidwell, O. and P. Gallimore. The Influence of a Decision Support Tool on Real Estate Valuations. Journal of Property Research, 2014, 31:1, 45–63.

Wood, M. Bootstrapped Confidence Intervals as an Approach to Statistical Inference. Organizational Research Methods, 2005, 8:4, 454–70.

Wolverton, M.L. Investigation into Price Knowledge Induced Comparable Sale Selection Bias. Georgia State University, Dissertation, 1996.

Wolverton, M.L. and P. Gallimore. Client Feedback and the Role of the Appraiser. Journal of Real Estate Research, 1999, 18:3, 415–31.

Zhu, S. and R.K. Pace. Distressed Properties: Valuation Bias and Accuracy. Journal of Real Estate Finance and Economics, 2012, 44, 153–66.

Julia Freybote, Portland State University, Portland, OR 97207 or [email protected].

Alan J. Ziobrowski, Georgia State University, Atlanta, GA 30302 or aziobrowski@gsu.edu.

Paul Gallimore, Massey University, Albany, New Zealand or [email protected].