"Survivorship bias"
From: 苏仁 (满满的生命力Vitality~)
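Is my belief that international economics and trade is where the money is itself a form of survivorship bias? The field sits closest to money and wealth, so it produces a great many "survivor" (rich person) stories, and I forget the countless people struggling bitterly on the same ground. By the same reasoning, are we overconfident about finance as a major?

[Breaking up with a boyfriend you have known for ten years looks like a tragedy. Seen from another angle, though, he may have been the one who kept you company so that you could study for the graduate entrance exam in peace, or the reason you left your hometown to try your luck in a big city. Much of what you have achieved today may be something he brought about without meaning to. Even if he is not the one who grows old with you, he is someone fate sent to carry you across. That the two of you finally parted means the role and the task he could take on in your life are complete; from here on each of you has your own path. Fate hands you the gifts it has prepared in its own way, and they are not meager gifts. It always gives you some things and withholds others, but I think that is because it wants you to understand certain truths, truths that no one can think through for you and that you can reach only by your own insight. If you refuse to think, or cannot think it through, you will never open the other door; you will keep circling in place, unable to go forward or back.]

Survivorship bias
http://en.wikipedia.org/wiki/Survivorship_bias
From Wikipedia, the free encyclopedia

Survivorship bias is the logical error of concentrating on the people or things that "survived" some process and inadvertently overlooking those that didn't because of their lack of visibility. This can lead to false conclusions in several different ways. The survivors may literally be people, as in a medical study, or could be companies or research subjects or applicants for a job, or anything that must make it past some selection process to be considered further.

Survivorship bias can lead to overly optimistic beliefs because failures are ignored, such as when companies that no longer exist are excluded from analyses of financial performance. It can also lead to the false belief that the successes in a group have some special property, rather than being just lucky. For example, if three of the five students with the best college grades went to the same high school, that can lead one to believe that the high school must offer an excellent education. This could be true, but the question cannot be answered without looking at the grades of all the other students from that high school, not just the ones who "survived" the top-five selection process.

Survivorship bias is a type of selection bias.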
In finance

In finance, survivorship bias is the tendency for failed companies to be excluded from performance studies because they no longer exist. It often causes the results of studies to skew higher because only companies which were successful enough to survive until the end of the period are included.

For example, a mutual fund company's selection of funds today will include only those that are successful now. Many losing funds are closed and merged into other funds to hide poor performance. In theory, 90% of extant funds could truthfully claim to have performance in the first quartile of their peers if the peer group includes funds that have closed.

In 1996 Elton, Gruber, & Blake showed that survivorship bias is larger in the small-fund sector than in large mutual funds (presumably because small funds have a high probability of folding).[1] They estimate the size of the bias across the U.S. mutual fund industry as 0.9% per annum, where the bias is defined and measured as: "Bias is defined as average α for surviving funds minus average α for all funds" (where α is the risk-adjusted return over the S&P 500; this is the standard measure of mutual fund out-performance).

Additionally, in quantitative back-testing of market performance or other characteristics, survivorship bias is the use of a current index membership set rather than the actual constituent changes over time. Consider a backtest to 1990 to find the average performance (total return) of S&P 500 members who have paid dividends within the previous year. Using only the current 500 members to build a historical equity line of the total return of the companies that met the criteria would add survivorship bias to the results. S&P maintains an index of healthy companies, removing companies that no longer meet its criteria as representatives of the large-cap U.S. stock market.
Companies that had healthy growth on their way to inclusion in the S&P 500 would be counted as if they were in the index during that growth period, when they were not. Meanwhile, a company that was in the index at the time but was losing market capitalization, was destined for the S&P 600 Small-cap Index, and was later removed would not be counted in the results. Using the actual membership of the index, with entry and exit dates applied so that each return is counted only during inclusion in the index, would allow for a bias-free output.

As a general experimental flaw

Survivorship bias (or survivor bias) is a statistical artifact in applications outside finance, where studies on the remaining population are fallaciously compared with the historic average despite the survivors having unusual properties. Mostly, the unusual property in question is a track record of success (like the successful funds).

For example, the parapsychology researcher Joseph Banks Rhine believed he had identified the few individuals from hundreds of potential subjects who had powers of ESP. His calculations were based on the improbability of these few subjects guessing the Zener cards shown to a partner by chance. A major criticism which surfaced against his calculations was the possibility of unconscious survivor bias in subject selection. He was accused of failing to take into account the large effective size of his sample (all the people he didn't choose as 'strong telepaths' because they failed at an earlier testing stage). Had he done this, he might have seen that, from the large sample, one or two individuals would probably achieve the track record of success he had found purely by chance.
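The criticism of Rhine's calculation can be put into numbers with a rough sketch. Everything specific here is an assumption for illustration, not a figure from Rhine's experiments: a run of 25 guesses with a 1-in-5 chance of a hit on each (a Zener deck has five symbols), a hypothetical "impressive" threshold of 11 hits, and hypothetical pool sizes. The point is only that a score which is very unlikely for one pre-chosen subject becomes likely for somebody once the pool is large.

```python
from math import comb


def tail_prob(n_trials: int, p_hit: float, min_hits: int) -> float:
    """P(X >= min_hits) for X ~ Binomial(n_trials, p_hit)."""
    return sum(
        comb(n_trials, k) * p_hit**k * (1 - p_hit) ** (n_trials - k)
        for k in range(min_hits, n_trials + 1)
    )


def prob_at_least_one(n_subjects: int, p_single: float) -> float:
    """Chance that at least one of n independent subjects reaches the score."""
    return 1 - (1 - p_single) ** n_subjects


# Hypothetical setup: 25 guesses, chance hit rate 1/5, threshold of 11 hits.
p_single = tail_prob(25, 0.2, 11)
print(f"one pre-chosen subject: {p_single:.4f}")

# Hypothetical pool sizes; the text only says "hundreds of potential subjects".
for n_subjects in (100, 500):
    p_any = prob_at_least_one(n_subjects, p_single)
    print(f"best of {n_subjects} subjects: {p_any:.4f}")
```

Judging the star subject by the first number, when he was in fact picked as the best of a large pool, is the error described above.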
(Similarly, many investors believe that chance is the main reason that most successful fund managers have the track records they do.)

Writing about the Rhine case, Martin Gardner explained that he didn't think the experimenters had made such obvious mistakes out of statistical naiveté, but as a result of subtly disregarding some poor subjects. He said that without trickery of any kind, there would always be some people who had improbable success, if a large enough sample were taken. To illustrate this, he speculates about what would happen if one hundred professors of psychology read Rhine's work and decided to make their own tests; he said that survivor bias would winnow out the typical failed experiments, but encourage the lucky successes to continue testing. He thought that the common null results (no effect) would go unreported, but:

"Eventually, one experimenter remains whose subject has made high scores for six or seven successive sessions. Neither experimenter nor subject is aware of the other ninety-nine projects, and so both have a strong delusion that ESP is operating."

He concludes: "The experimenter writes an enthusiastic paper, sends it to Rhine who publishes it in his magazine, and the readers are greatly impressed".

If enough scientists study a phenomenon, some will find statistically significant results by chance, and these are the experiments submitted for publication. To combat this, some editors now call for the submission of 'negative' scientific findings, where "nothing happened." Survivorship bias is one of the issues discussed in the provocative 2005 paper "Why Most Published Research Findings Are False."[2]

I would speculate, based on what I know of biotechnology, that the claimed role of patents as an aid to innovation is an example of survivorship bias. We hear, loudly, from companies that make a lot of money from patents, but we don't hear from the failed or never-started projects.
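One can see this clearly in academic research: methods that are not protected (two-hybrid, GFP) become popular because, unprotected by patents, innovation can flourish. However, I don't have any data suitable for Wikipedia; these are just my thoughts. I agree that the Rhine section could be reworked to make clearer that

First, a story:

In 1941, with the Second World War raging, Abraham Wald, a well-known statistician at Columbia University in the United States, received an unexpected visitor: a combat commander of the British Royal Air Force. "Professor Wald," he said, "every time our pilots fly out on a bombing mission, the report we most dread hearing is 'Calling headquarters, I've been hit!' Please help us with this problem; our pilots' lives depend on it."

Wald took on the urgent study. He was asked to analyze data on hits scored by German ground fire on Allied bombers and to use his statistical expertise to recommend how the airframe armor should be strengthened to reduce the chance of being shot down. With the aviation technology of the day, armor could be added only in places; otherwise the aircraft would become too heavy, making takeoff difficult and handling sluggish. Wald plotted the bullet-hole data from the Allied bombers as two comparison charts. His study found that the wings were the most frequently hit part, while the cockpit and the tail were hit least. The RAF was very pleased with his thorough analysis.

So let me ask you: should we put the armor on the parts that are hit most often? And what is the reason for your yes or no? Read on:

At the meeting where the findings were presented, the question set off a heated debate. The combat commander in charge of the project said: "Professor Wald's study clearly shows that the wings of our bombers are riddled with bullet holes and are the most likely place to be hit. We should therefore strengthen the wing armor."

Wald replied, politely but firmly: "General, I respect your expertise in flying, but I see it completely differently. I recommend strengthening the armor around the cockpit and the tail engine section, because that is where the fewest bullet holes were found." Facing a room full of astonished and doubtful looks, he explained: "The sample I analyzed contains only bombers that made it back to base. From a statistical point of view, bombers hit many times in the wings still seem able to return safely. The parts where bullet holes are rarely found are not parts that never get hit; rather, a plane hit there does not make it back at all."

The commander objected: "I admire Professor Wald for daring to make such a bold inference without any flying experience. Speaking for myself, I have taken severe wing damage on missions more than once, and had I not been a seasoned pilot with some luck on my side, I would have crashed and died long ago. So I still strongly insist that we strengthen the wing armor."

With the two views deadlocked, the Air Minister agonized: should he trust the battle-hardened flying general, or the statistician who stood alone against everyone else? Because the war situation was urgent and no further study was possible, the minister decided to accept Wald's recommendation and immediately strengthen the armor protecting the cockpit and the tail engines. Before long, the proportion of Allied bombers shot down did fall markedly. To confirm that the decision was right, some time later the British military used agents behind enemy lines to collect the wreckage of some Allied aircraft that had crashed in German territory. The hits on them were, just as Wald had predicted, concentrated mainly around the cockpit and the engines. The bullet holes you cannot see are the deadliest!!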
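The mechanism in the story can be reproduced with a toy simulation. This is a minimal sketch under invented assumptions, not a reconstruction of Wald's data or method: the four section names, the per-hit lethality values, the number of hits per plane, and the uniform spread of hits are all hypothetical. Ground fire hits every section equally often, yet the holes counted on returning planes pile up on the wings.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in this section downs the plane.
LETHALITY = {"wings": 0.05, "fuselage": 0.10, "tail_engine": 0.60, "cockpit": 0.70}


def simulate(n_planes: int = 10_000, hits_per_plane: int = 4, seed: int = 0):
    """Return (hits on all planes, hits on planes that made it home)."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits: Counter = Counter()
    returned_hits: Counter = Counter()
    for _ in range(n_planes):
        # Assumption: every section is equally likely to be hit.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if none of its hits proves fatal.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        all_hits.update(hits)
        if survived:
            returned_hits.update(hits)
    return all_hits, returned_hits


all_hits, returned_hits = simulate()
for section in LETHALITY:
    print(f"{section:12s} all planes: {all_hits[section]:6d}   "
          f"returned planes: {returned_hits[section]:6d}")
```

Only the right-hand column is visible to an analyst on the airfield; the left-hand column is what the armor decision actually depends on.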
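At first glance the commander's decision to strengthen the wing armor seems entirely reasonable, but he overlooked one fact: the distribution of bullet holes is severely biased data. The most critical data are on the planes that were shot down, and those planes cannot be observed. The bullet-riddled wings are therefore actually the toughest part of the aircraft. By paying too much attention to the bullet holes he could see, the commander nearly made the wrong decision.

Two points in this case deserve particular caution. First, the dead and the captured cannot speak, so collecting more data does not improve the quality of the decision. Because the source of the bullet-hole data is itself severely biased, working hard to collect more of it would probably only deepen the original misunderstanding. Second, calling in more combat-experienced pilots for expert opinion does not improve the decision either, because those pilots are themselves part of the process that generates the biased data. They all flew home safely; they may have been hit in the wings, but none of them is a "martyr" who was hit in the cockpit or the engines. Put simply, the harder they stare at the bullet holes they can see, the further they get from the truth.

In the information field there is the saying "Garbage In, Garbage Out": if the premise is wrong, no statistical method, however elegant, and no amount of data can make the inference that follows correct. In management practice and daily life, much critical data, like that in the bomber case, goes unobserved because of "failure". This is as vivid and apt an illustration of survivorship bias as one could ask for. If a 70-year-old man says on television that he has lived so long thanks to a pack of cigarettes and a jin (half a kilogram) of liquor every day, remember that "dead people can't go on television"! Likewise, the fact that the long-lived elders of some place eat or drink a certain thing does not make that thing a health elixir.

Here is another example, a swindle (which has by now evolved into an e-mail version).

On January 2 you receive an anonymous letter telling you the market will rise this month. The market does rise, but you think nothing of it, since everyone knows about the January effect (historically, stocks have risen more often than fallen in January). On February 1 you receive another letter telling you the market will fall. Once again the letter proves right. On March 1 another letter arrives, and the same thing happens. By July you are intrigued by the anonymous writer's foresight, and you are invited to invest in a certain offshore fund. You put in all your savings. Two months later the money is gone for good, like a meat bun thrown at a dog. You sob on your neighbor's shoulder, and he tells you that he received two of these mysterious letters too, but they stopped after the second. The first letter's prediction was right, he says, but the second was wrong.

What happened? The swindlers' trick is this: they pick ten thousand names from the phone book and send a bullish letter to half of them and a bearish letter to the other half. A month later, five thousand people have received a correct prediction, and the swindlers repeat the procedure on those five thousand. After another month, 2,500 people have received correct predictions, and so on until about five hundred remain on the list, of whom two hundred will be taken in. For a few thousand dollars in postage, the swindlers can bring in several million dollars.

Now vary the method a little. A swindler poses as an investment adviser recruiting members and tells you that you can join as an ordinary member first and upgrade to VIP once you find his calls accurate. What makes this variation cleverer is that the swindler makes money from the very start, and the VIP members then build his reputation for him, vouching for how accurate he is: survivorship bias. It works as long as information does not circulate and other people do not know how (in)accurate this fake adviser really is.

--------------------------------------------

Our habitual logic says that successful companies succeed for a reason, so we distill and summarize those reasons. But we seldom look at the companies that grew up in the same era and then died off one after another, and ask why they died.
Study the successful and you see the lovely skin.
Study the failures and you see the vital points on which survival depends.

[Too Many Millionaires Next Door]
http://book.douban.com/review/2027195/

The emotional effect in survivorship bias: it is human nature that we cannot become more rational, or keep our emotions unaffected in the face of social slights, at least not with our current genetic code. Rational thinking brings no comfort. (p. 139)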
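Double survivorship bias: in The Millionaire Next Door, the first survivorship bias is that the authors look only at the winners and make no attempt to correct their statistics for it. They say nothing about "accumulators" who accumulated the wrong things. Nowhere does the book mention that some people invested in the winners by luck; such people would no doubt find their way into the book. There is a way to correct for the bias: cut the millionaires' average wealth by, say, 50%, on the grounds that the bias has pushed the observed millionaires' average wealth up considerably (the cut amounts to factoring in the effect of the losers). The conclusions would certainly change. As for the second bias, the book focuses on an unusual episode in history, in which asset prices went through the strongest bull market on record: since 1982, every dollar invested in stocks has on average grown almost twenty-fold, and that is only the average stock; some people in the sample may hold stocks that did better than average. Virtually every subject got rich from asset price inflation, that is, from the inflation in the prices of financial securities and assets since 1982. An investor who followed the same strategy during a less glorious market period would certainly have a different story to tell.

Ignoring survivorship bias is a chronic mistake, even (perhaps one should say especially) among professionals. Why? Because we are trained to make use of the information lying in front of us and to disregard what we do not see. We tend to take the one realized history, among all possible random histories, as the most representative, forgetting that there could have been others. In short, survivorship bias means that the best performers are the most visible. Why? Because the losers do not show up at all.

Survivorship bias, data mining, data snooping, overfitting, regression to the mean, and so on are all variants of the same thing: basically, situations in which performance is exaggerated because the observer does not properly appreciate the importance of randomness. Clearly the concept has some unsettling side effects. It extends to more general situations in which randomness can play a part, such as the choice of a medical treatment or the interpretation of coincidental events.

Why is finance such a hot field for this? Because it is a field where we have plenty of information (in the form of abundant price series) but cannot run experiments as in physics. Its salient flaws show in this dependence on past data.

Some people survive because their traits happen to fit a given structure of randomness.

Investors with no skill at all (p. 150): the first counterintuitive conclusion is that a population made up entirely of incompetent managers will produce a small number with outstanding track records. Even if the population consists entirely of managers who are bound to lose money in the long run, the result does not change much, because volatility guarantees that some of them will make money. Volatility actually works in favor of bad investment decisions. The second counterintuitive conclusion is that the quantity we care about, the expectation of the maximum of track records, depends on the size of the initial sample rather than on each manager's individual odds. In other words, the number of managers with outstanding track records I can find in a given market depends far more on how many people went into the investment business at the outset (instead of going to dental school) than on their ability to produce profits. It also depends on the market's volatility. Why do I say "expectation of the maximum"? Because I do not care about the average track record at all; I get to see only the best of the managers, not all of them. This means that as long as more people entered the business in 1997 than in 1993, I will see more "excellent managers" in 2002 than in 1998.

People believe they can work out the properties of a distribution from the sample they see. For matters that depend entirely on the maximum, what we infer is a different distribution altogether: that of the best performers. The difference between the mean of that distribution and the unconditional distribution of winners and losers is what we call the survivorship bias. Here it is the fact that about 3% of those in the initial cohort will make money five years running. Moreover, time wears away the annoying effects of randomness. Looking ahead, although these managers were profitable in each of the past five years, we should expect them to do no better than break even in any future period. In the end they will fare no better than those who were knocked out early in the exercise.

No one accepts randomness in his own success, only in his failure.

Survivorship bias depends on the number of people who took part at the start. So the mere fact that someone has made money in the past is neither meaningful nor relevant; we need to know the size of the population he came from. If we do not know how many managers tried and failed, we cannot assess the validity of a track record.

Our understanding of the distribution of coincidences is biased.

If the success of an investment is entirely due to random factors, the probability that it comes knocking at your door is high. Economists and insurance people call this adverse selection. Because of this selection bias, judging an investment that comes to you requires a stricter standard than judging one you went looking for.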
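The "about 3%" figure and the dependence on cohort size quoted above can be checked with a short sketch. The assumptions are mine, not the book's: each manager is a fair coin with a 50% chance of a profitable year, years are independent, and the cohort sizes are arbitrary. Under those assumptions the share with five winning years in a row is 0.5^5, roughly 3.1%, so the number of "stars" is set by how many people started, not by anyone's skill.

```python
import random


def expected_streaks(n_managers: int, p_win: float, years: int) -> float:
    """Expected number of managers who are profitable in every year."""
    return n_managers * p_win**years


def simulate_streaks(n_managers: int, p_win: float, years: int, seed: int = 0) -> int:
    """Count simulated managers with an unbroken run of profitable years."""
    rng = random.Random(seed)
    return sum(
        all(rng.random() < p_win for _ in range(years))
        for _ in range(n_managers)
    )


# Hypothetical cohort sizes; every manager has zero skill (p_win = 0.5).
for cohort in (1_000, 10_000, 40_000):
    expected = expected_streaks(cohort, 0.5, 5)
    observed = simulate_streaks(cohort, 0.5, 5)
    print(f"cohort {cohort:6d}: expected {expected:7.1f}, simulated {observed:5d}")

# Managers who lose on average (hypothetical p_win = 0.45) still yield "stars".
print(f"losing cohort of 10,000: expected {expected_streaks(10_000, 0.45, 5):.1f}")
```

Seeing only the final count, without the cohort size, says nothing about whether any of these records reflects ability, which is the point made above about needing to know how many managers tried and failed.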
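I tend to confuse book reviews with the best reviews of a book. A review is supposed to be an assessment of the book's quality, but survivorship bias is at work here too: I mistake the probability distribution of the maximum of a variable for the distribution of the variable itself. Publishers never put anything on the jacket but the best praise. Some authors go further still and lift phrases out of lukewarm or even unflattering reviews so that they read like acclaim for the book.

Historically, medicine has advanced by trial and error; in other words, statistically. The link between a symptom and its treatment can be entirely coincidental, and some drugs succeed in medical trials for purely random reasons. Medical researchers are rarely versed in statistics, and statisticians rarely do medical research; many medical researchers are not even aware of this bias.

A series of random trials does not need to be free of any pattern in order to count as random. In fact, data showing no pattern whatsoever would be highly suspicious and would look man-made. A single random run is bound to display some pattern.

People confuse finding that something is absent with the absence of a finding. The fact that nothing happened can itself be a significant piece of information. As Sherlock Holmes observed in the Silver Blaze case, the curious thing was that the dog did not bark. More problematic still, many scientific results go unpublished because they are not statistically significant, which does not mean they carry no information.

In the absence of more information, I prefer to reserve judgment. It is safer.

http://www.douban.com/note/185393305/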