Are We Quacks?
28 November 2011 at 5:23 am Lasse 14 comments
| Lasse Lien |
Rich Bettis makes an important point in a forthcoming issue of SMJ. He shows how two unfortunate practices interact to create a serious and fundamental problem for knowledge accumulation in (strategic) management.
One is the widespread practice of running numerous regressions on a given dataset and then adapting (or, in milder cases, “tuning”) hypotheses or theory to fit the data. By itself this practice is unfortunate enough: data patterns can and will occur by chance, and the more regression models one tries, the more likely one is to “find” something. We obviously do not want such random patterns to influence either theory building or our catalog of empirical findings. Still, the problem would be far less serious if replication studies were common and non-findings were gladly published. Random correlations in the data would not survive replication tests and would be eliminated fairly quickly.
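To see how easily chance patterns appear, here is a minimal simulation sketch. Nothing in it comes from Bettis’s paper: the sample size, the 20 candidate regressors, and the choice of Python with numpy/scipy are all illustrative assumptions.

    # A minimal illustration of the specification-search problem described above:
    # regress a pure-noise outcome on 20 pure-noise candidate predictors, one at a
    # time, and count how often at least one comes out "significant" at the 5% level.
    # All numbers here (sample size, number of predictors) are illustrative.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_obs, n_candidates, n_datasets = 200, 20, 1000

    datasets_with_finding = 0
    for _ in range(n_datasets):
        y = rng.normal(size=n_obs)                      # outcome is pure noise
        X = rng.normal(size=(n_obs, n_candidates))      # predictors are pure noise
        for j in range(n_candidates):
            slope, _, _, p_value, _ = stats.linregress(X[:, j], y)
            if p_value < 0.05:                          # "significant" by chance
                datasets_with_finding += 1
                break                                   # stop at the first "finding"

    print(f"Share of datasets yielding at least one 'finding': "
          f"{datasets_with_finding / n_datasets:.2f}")  # roughly 1 - 0.95**20, about 0.64

With 20 candidate regressors and a 5% threshold, roughly two-thirds of pure-noise datasets will hand the researcher at least one “significant” result to build a story around.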
As we all know, replication studies in management cannot get published and are basically just not done. To make matters worse, we don’t publish non-findings either. This is the second unfortunate practice. Taken together, these two practices may in the worst case mean that much of what we think we know in management is just random data patterns, discovered through data mining and protected by our lack of replication studies and our refusal to publish non-findings. This is a sobering thought. As Bettis points out, we should all be very thankful that replication studies are more common in medical research than in management.
What is the solution? Well, a first step might be to launch the Journal of Managerial Replication Studies and give it the prestige it deserves. Either SMS or AOM should see the launch of such a journal as a crucial responsibility. I mean, we really don’t want to be quacks, do we?
HT: Helge Thorbjørnsen
Entry filed under: - Lien -, Management Theory, Methods/Methodology/Theory of Science, Papers, Strategic Management.
1.
Michael Marotta | 28 November 2011 at 6:55 am
Over on OrgTheory, Fabio Rojas is questioning the value of useless college majors, an investigation that may have unintended consequences. It is surely true that management people hire others like themselves. Their being quacks would remain independent of that until and unless equity holders take management control of their firms.
Alternately, human beings are not billiard balls. We hold simple physics as a paradigm when it may not be applicable to highly complex, indeterminate, chaotic, and singular phenomena. As we know from studies of entrepreneurship, failure may be predictable, but success is less tractable. Like entrepreneurship, successful management may not be replicable by controlled study.
Furious regression analysis may be a puppy chasing its tail, but the same could be said of double-entry bookkeeping, algebra, and calculus. If mathematics reveals nature, then valid analysis brings subtle truths.
And then, there is the placebo effect. Placebos work because they bring peace of mind, taking the body out of panic mode, and allowing recovery and healing. So, too, may quack management theorists allow us to move out of crisis and into analysis. After all, astrology led to astronomy. If not for earlier quacks, we might not have launched yet another Mars probe the other day.
2.
David Hoopes | 28 November 2011 at 11:56 am
I think Rich B.’s points are valid. I’ve had reviewers suggest I fit my a priori theory to the data. My work is qualitative, so it doesn’t matter the way it does for statistical inference. But since the same reviewers were throwing a lot of hypothesis-testing language at us, I was left with the bad feeling that these learned scholars would do the same with a sample of data used in hypothesis testing. Obviously, all the sampling techniques are meaningless if you change the theory to fit the sample. I’m often left with the feeling that many management scholars don’t understand the statistics they use.
3.
Thomas | 29 November 2011 at 2:20 am
A related problem is the lack of straightforward practical criticism of published work (especially highly influential work). While it is possible to praise a scholar’s “literary performances”, it is very difficult to publish work that carefully examines the style, argument, and sourcing of such performances, especially where such examination draws the scholarship of influential scholars into question.
I think replication is resisted for similar reasons. We worry that if we allow such things, we’ll open the pages of the journals to cranks. Seems that’s the choice: quacks or kooks.
4.
Lasse | 29 November 2011 at 2:31 am
I think we can avoid the whole problem by issuing a dictate that the only research allowed is grounded theory.
5.
srp | 29 November 2011 at 5:05 am
1. Ioannidis has argued that this type of data snooping and selective publication is actually a huge problem in the biomedical literature, to the extent that most published results are probably wrong. There’s been a lot of publicity about his work.
2. Most published empirical findings in management aren’t clear or earthshaking enough, especially in practitioner terms, to motivate anyone to replicate or refute them.
3. In many cases, papers have idiosyncratic operationalizations of theory suited to a specific industry context and can’t be cleanly replicated or disconfirmed with data from a different context. If someone shows an interaction effect between X and Y on performance in the telecom industry but someone else fails to find the effect in the hotel industry, it’s usually not clear if that is a disconfirmation, the identification of a contingency, or a problem with the operationalizations used in either industry study.
4. The findings that people do widely believe, such as the efficacy of goal-setting and the correlation of asset specificity with higher governance safeguards, have been tried out on numerous data sets.
5. Formal “solutions” to the data-snooping problem do exist, such as requiring blind holdout subsamples on which no exploratory analysis is performed. Alternatively, some version of Bayesian analysis, such as Leamer’s Extreme Bounds Analysis, can be used to discipline specification searches.
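To illustrate the holdout idea in point 5, here is a rough sketch of what such a split-sample discipline could look like. It is only an illustrative workflow under assumed settings (a 50/50 split, pure-noise data, Python with numpy/scipy), not Leamer’s procedure or any journal’s actual requirement.

    # Illustrative split-sample discipline for specification searches: explore freely
    # on one half of the data, then test the single chosen specification once on the
    # untouched holdout half. The split and the data below are assumptions.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_obs, n_candidates = 400, 20
    y = rng.normal(size=n_obs)
    X = rng.normal(size=(n_obs, n_candidates))

    explore = np.arange(n_obs) < n_obs // 2
    holdout = ~explore

    # Exploratory phase: pick the regressor with the smallest p-value on the
    # exploration half (i.e., deliberately "data snoop" here).
    p_explore = [stats.linregress(X[explore, j], y[explore]).pvalue
                 for j in range(n_candidates)]
    best = int(np.argmin(p_explore))

    # Confirmatory phase: one pre-committed test on the holdout half.
    confirm = stats.linregress(X[holdout, best], y[holdout])
    print(f"Snooped p-value (exploration half): {p_explore[best]:.3f}")
    print(f"Honest p-value (holdout half):      {confirm.pvalue:.3f}")
    # Because y is pure noise, the holdout p-value will usually be unremarkable.

The point is simply that whatever is “found” in the exploration half must survive a single, pre-committed test on data it has never seen.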
6.
Rich Makadok | 29 November 2011 at 2:22 pm
Yes, there is a lot of crap clogging up our journals. And, as usual, Steve is right on the money about all of the reasons why. Here’s an amusing cartoon on the data-dredging issue:
http://xkcd.com/882/
There have also been some widely publicized examples of this problem in public health, like studies of whether exposure to electromagnetic fields (e.g., power lines) causes diseases (e.g., cancer). Some studies collected dozens of different measures of EMF exposure and correlated them with dozens of different health outcomes, for a total of hundreds or even thousands of different correlations. There were many insignificant correlations, and even some significantly negative correlations (i.e., apparent health BENEFITS of EMF exposure), but naturally the only ones that got submitted for publication were the small number of significant positive correlations.
Surely econ must face the same problems. How do the econ journals address this problem?
As David’s comment above suggests, I would bet that a large part of the problem is just the basic fact that many authors and reviewers simply are either not aware of, or at least not sufficiently sensitive to, these problems. Maybe we can’t prevent all opportunistic behavior since any system can be gamed or circumvented by someone with sufficient guile, but I would bet that most people in this field want to do the right thing — or at least don’t want to risk getting caught doing the wrong thing. So, if we establish community norms that are clearly articulated and widely broadcast, then I suspect that the problem will diminish dramatically.
Based on this optimistic viewpoint, here are my two cents about simple, common-sense, low-cost ways to mitigate the problem:
1.) Our PhD programs should provide explicit training on research ethics, covering all of the issues discussed here. Yes, I know this is not a panacea; MBA-level ethics training did not prevent any number of recent business scandals. But clearly telling newcomers to the profession what is OK and what is not OK has to be at least some improvement over the status quo. And, per Steve’s comment, the works of Ed Leamer should figure prominently in such training.
2.) Our journals should require all authors who submit manuscripts and all reviewers who review manuscripts to first complete a simple one-day on-line training and comprehension test about these issues, similar to IRB training and testing. This could be like an abbreviated version of the PhD research-ethics training suggested in item #1 above. If any journal editors are reading this, what do you think?
3.) Authors submitting manuscripts to journals should be required to certify, on their honor, that the study does not include any results that were derived through data dredging ( http://en.wikipedia.org/wiki/Data_dredging ) or by ex post adaptation or “tuning” of hypotheses — i.e., in the same way that authors already must certify that the manuscript is not under consideration for publication at another journal. Again, if any journal editors are reading this, what do you think?
4.) As an author, don’t submit crap. As a reviewer or editor, don’t accept crap. As a letter-writer or tenure-review committee member, don’t reward crap.
If the editors of our leading journals and the directors of our leading PhD programs got together to agree on implementing these simple, common-sense, low-cost steps, then I bet the problem would diminish dramatically.
Anyway, that’s my second “rant du jour.” The first one is over at strategyprofs. Now I gotta get back to real work.
Thanks,
Rich
7.
Peter Klein | 29 November 2011 at 4:28 pm
Rich, the AER has an explicit policy about data and code sharing “for purposes of replication,” though it isn’t clear how this is working in practice: http://www.aeaweb.org/aer/data.php. Excerpt:
“It is the policy of the American Economic Review to publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication. Authors of accepted papers that contain empirical work, simulations, or experimental work must provide to the Review, prior to publication, the data, programs, and other details of the computations sufficient to permit replication. These will be posted on the AER Web site. The Editor should be notified at the time of submission if the data used in a paper are proprietary or if, for some other reason, the requirements above cannot be met.
“As soon as possible after acceptance, authors are expected to send their data, programs, and sufficient details to permit replication, in electronic form, to the AER office. Please send the files via e-mail to aeraccept@aeapubs.org, indicating the manuscript number. Questions regarding any aspect of this policy should be forwarded to the Editor.”
8.
Rich Makadok | 30 November 2011 at 10:25 am
Thanks, Peter. That sounds like another good addition to the list of simple, common-sense, low-cost steps for mitigating this problem. Again, would any journal editors out there care to comment? Joe?
9.
Joe Mahoney | 30 November 2011 at 11:47 am
Dear Rich — As an Associate Editor at SMJ and as a Director of Graduate Studies, I support your recommendations. If you want to discuss off-line drafting some recommendations for our three editors to consider (Rich Bettis, Will Mitchell and Ed Zajac), I would be pleased to work with you. I also look forward to others’ input on ways that we can serve the public good better. Best regards, Joe Mahoney
10.
Peter Klein | 30 November 2011 at 5:00 pm
Great, thanks much to Rich and Joe! BTW, Rich, I want to go on record as giving full support to your #4. :)
11.
Rich Makadok | 30 November 2011 at 6:27 pm
That’s the easiest one to support.
What could possibly be the argument against it?
12.
David Hoopes | 1 December 2011 at 3:24 pm
I think a lot of people in the management field simply do not understand the statistical techniques they use. SRP got tired of hearing me complain about the small sample sizes used in structural equation analyses. At a more fundamental level, I think people learn to use canned programs but don’t understand the assumptions behind the techniques. Network analysis, cluster analysis, even seemingly straightforward regression analysis are commonly misused. By misused I mean that their use was originally predicated on certain well-known constraints, and these constraints have, for whatever reason, been relaxed over time (in some cases very quickly). So I think that in management, training is often poor.
13.
Robert Higgs | 27 December 2011 at 1:32 pm
Years ago, back in the 1970s as I recall, the Journal of Political Economy initiated the practice of publishing replications. Some of these papers were quite interesting, showing, for example, that using a different statistical package to run the regressions gave substantially different results. Yet, for whatever reason, the JPE ceased publishing such papers after a while. I suspect that people simply found the credit they received from the profession and their own universities insufficient to justify the often considerable work of doing the replications.
The profession wants “fresh results,” “something new,” “something creative,” etc. People respond to these professional incentives. Surely many authors know that the statistical analysis they use is crap — completely inappropriate to the sort of data they analyze (in economics, generally data NOT obtained by random sampling). Yet, if one does not throw in the “tests of statistical significance” and the rest of the usual crap, one gets zapped by referees! So, people continue to do what they are rewarded for doing and to shun what they are punished for doing. Shocking, eh?
14.
Jim Rose | 2 January 2012 at 1:26 am
Robert,
The better part of David Card’s work on minimum wages was to point out the low quality of the prior econometrics.
See Leonard, Thomas C. (2000) “The Very Idea of Applying Economics: The Modern Minimum-Wage Controversy and its Antecedents.” In Roger Backhouse and Jeff Biddle (eds), Toward a History of Applied Economics, History of Political Economy, Supplement to Vol. 32, pp. 117-144.
Card and Krueger reviewed the established econometric literature and judged it unreliable, tainted by publication bias and specification search.
They argued that time-series estimates involve larger and larger samples over time, which, ceteris paribus, should increase t-statistics.
If the additional data in newer studies are independent of the older data, “then a doubling of the sample size should result in an increase in the absolute value of the t-ratio of about 40 percent”.
Card and Krueger found that t-statistics are, in fact, declining in sample size.
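For what it’s worth, the 40 percent figure follows from the t-ratio growing roughly with the square root of the sample size, and sqrt(2) is about 1.41. Here is a quick simulation sketch under assumed values; the effect size, sample sizes, and Python/scipy setup are illustrative and have nothing to do with Card and Krueger’s actual data.

    # Check of the claim that doubling the sample should raise the t-ratio by about
    # 40% when a true effect exists: t grows roughly with sqrt(n), and sqrt(2) ~ 1.41.
    # The effect size and sample sizes below are illustrative assumptions.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    true_slope, n_small = 0.2, 500

    def average_t(n, reps=2000):
        """Average absolute t-ratio of the slope over many simulated samples of size n."""
        ts = []
        for _ in range(reps):
            x = rng.normal(size=n)
            y = true_slope * x + rng.normal(size=n)
            res = stats.linregress(x, y)
            ts.append(abs(res.slope / res.stderr))
        return np.mean(ts)

    t1, t2 = average_t(n_small), average_t(2 * n_small)
    print(f"Average |t| at n={n_small}:   {t1:.2f}")
    print(f"Average |t| at n={2 * n_small}: {t2:.2f}")
    print(f"Ratio: {t2 / t1:.2f}  (theory predicts about sqrt(2), i.e. roughly 1.41)")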