Dark Data: A Practical Guide to Making Good Decisions in a World of Missing Data

Notes

Chapter 1. Dark Data

. , accessed 16 April 2019.

. , accessed 16 April 2019.

. , accessed 16 April 2019.

. E. M. Mirkes, T. J. Coats, J. Levesley, and A. N. Gorban, “Handling missing data in large healthcare dataset: A case study of unknown trauma outcomes.” Computers in Biology and Medicine 75 (2016): 203-16.

. .

. D. Rumsfeld, Department of Defense News Briefing, 12 February 2002.

. , accessed 31 July 2018.

. .

. ; for the Rogers Commission report, see .

. R. Pattinson, Arctic Ale: History by the Glass, issue 66 (July 2012), , accessed 31 July 2018.

Chapter 2. Discovering Dark Data

. D. J. Hand, F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski, A Handbook of Small Data Sets (London: Chapman and Hall, 1994).

. D. J. Hand, “Statistical challenges of administrative and transaction data (with discussion),” Journal of the Royal Statistical Society, Series A 181 (2018): 555-605.

. , accessed 24 August 2018.

. M. E. Kho, M. Duffett, D. J. Willison, D. J. Cook, and M. C. Brouwers, “Written informed consent and selection bias in observational studies using medical records: Systematic review,” BMJ (Clinical Research Ed.) 338 (2009): b866.

. S. Dilley and G. Greenwood, “Abandoned 999 calls to police more than double,” 19 September 2017, , accessed 10 December 2017.

. M. Johnston, The Online Photographer, 17 February 2017, , accessed 28 December 2017.

. A. L. Barrett and B. R. Brodeski, “Survivorship bias and improper measurement: How the mutual fund industry inflates actively managed fund performance” (Rockford, IL: Savant Capital Management, Inc., March 2006), , accessed 28 December 2017.

. T. Schlanger and C. B. Philips, “The mutual fund graveyard: An analysis of dead funds,” The Vanguard Group, January 2013.

. .

. Knowledge Extraction Based on Evolutionary Learning, , accessed 22 September 2019.

. M. C. Bryson, “The Literary Digest poll: Making of a statistical myth,” The American Statistician 30 (1976): 184-5.

. , accessed 4 November 2018.

. Office for National Statistics: .

. R. Tourangeau and T. J. Plewes, eds., Nonresponse in Social Surveys: A Research Agenda (Washington, DC: National Academies Press, 2013).

. J. Leenheer and A. C. Scherpenzeel, “Does it pay off to include non-internet households in an internet panel?” International Journal of Internet Science 8 (2013): 17-29.

. Tourangeau and Plewes, Nonresponse in Social Surveys.

. H. Wainer, “Curbstoning IQ and the 2000 presidential election,” Chance 17 (2004): 43-46.

. I. Chalmers, E. Dukan, S. Podolsky, and G. D. Smith, “The advent of fair treatment allocation schedules in clinical trials during the 19th and early 20th centuries,” Journal of the Royal Society of Medicine 105 (2012): 221-7.

. J. B. Van Helmont, Ortus Medicinae, The Dawn of Medicine (Amsterdam: Apud Ludovicum Elzevirium, 1648), , accessed 15 June 2018.

. W. W. Busse, P. Chervinsky, J. Condemi, W. R. Lumry, T. L. Petty, S. Rennard, and R. G. Townley, “Budesonide delivered by Turbuhaler is effective in a dose-dependent fashion when used in the treatment of adult patients with chronic asthma,” Journal of Allergy and Clinical Immunology 101 (1998): 457-63; J. R. Carpenter and M. Kenward, “Missing data in randomised controlled trials: A practical guide,” November 21, 2007, , accessed 7 May 2018.

. P. K. Robins, “A comparison of the labor supply findings from the four negative income tax experiments,” Journal of Human Resources 20 (1985): 567-82.

. A. Leigh, Randomistas: How Radical Researchers Are Changing Our World (New Haven, CT: Yale University Press, 2018).

. P. Quinton, “The impact of information about crime and policing on public perceptions,” National Policing Improvement Agency, January 2011, , accessed 17 June 2018.

. J. E. Berecochea and D. R. Jaman, Time Served in Prison and Parole Outcome: An Experimental Study: Report Number 2 (Research Division, California Department of Corrections, 1983).

. G.C.S. Smith and J. Pell, “Parachute use to prevent death and major trauma related to gravitational challenge: Systematic review of randomised controlled trials,” British Medical Journal 327 (2003): 1459-61.

. The Washington Post, “Test of ‘dynamic pricing’ angers Amazon customers,” October 7, 2000, , accessed 19 June 2018.

. BBC, “Facebook admits failings over emotion manipulation study,” BBC News, 3 October 2014, , accessed 19 June 2018.

Chapter 3. Definitions and Dark Data

. .

. Immigration figures: , accessed 2 January 2018.

. Office for National Statistics: “Crime in England and Wales: Year ending June 2017,” , accessed 4 January 2018.

. J. Wright, “The real reasons autism rates are up in the U.S.” Scientific American, March 3, 2017, , accessed 3 July 2018.

. N. Mukadam, G. Livingston, K. Rantell, and S. Rickman, “Diagnostic rates and treatment of dementia before and after launch of a national dementia policy: An observational study using English national databases,” BMJ Open 4, no. 1 (January 2014), , accessed 3 July 2018.

. .

. .

. Titanic Disaster: Official Casualty Figures, 1997, , accessed 2 October 2018.

. A. Agresti, Categorical Data Analysis, 2d ed. (New York: Wiley, 2002), 48-51.

. W. S. Robinson, “Ecological correlations and the behavior of individuals,” American Sociological Review 15 (1950): 351-7.

. G. Gigerenzer, Risk Savvy: How to Make Good Decisions (London: Penguin Books, 2014), 202.

. W. J. Krzanowski, Principles of Multivariate Analysis, rev. ed. (Oxford: Oxford University Press, 2000), 144.

Chapter 4. Unintentional Dark Data

. S. de Lusignan, J. Belsey, N. Hague, and B. Dzregah, “End-digit preference in blood pressure recordings of patients with ischaemic heart disease in primary care,” Journal of Human Hypertension 18 (2004): 261-5.

. L. E. Ramsay et al., “Guidelines for management of hypertension: Report of the third working party of the British Hypertension Society,” Journal of Human Hypertension 13 (1999): 569-92.

. J. M. Roberts Jr. and D. D. Brewer, “Measures and tests of heaping in discrete quantitative distributions,” Journal of Applied Statistics 28 (2001): 887-96.

. .

. B. Kenber, P. Morgan-Bentley, and L. Goddard, “Drug prices: NHS wastes £30m a year paying too much for unlicensed drugs,” Times (London), 26 May 2018, , accessed 26 May 2018.

. H. Wainer, “Curbstoning IQ and the 2000 presidential election,” Chance 17 (2004): 43-46.

. W. Kruskal, “Statistics in society: Problems unsolved and unformulated,” Journal of the American Statistical Association 76 (1981): 505-15.

. I have been unable to find a clear origin for this law. In his 1979 presidential address to the Royal Statistical Society, Claus Moser (“Statistics and public policy,” Journal of the Royal Statistical Society, Series A 143 (1980): 1-32) says that it was developed by the UK Central Statistical Office. Andrew Ehrenberg cites it as Twyman’s Law without giving a source (“The teaching of statistics: Corrections and comments,” Journal of the Royal Statistical Society, Series A 138 (1975): 543-45).

. T. C. Redman, “Bad data costs the U.S. $3 trillion per year,” Harvard Business Review, 22 September 2016, , accessed 17 August 2018.

. ADRN, .

. , accessed 24 August 2018.

Chapter 5. Strategic Dark Data

. :32004L0113, accessed 18 February 2019.

. M. Hurwitz and J. Lee, Grade Inflation and the Role of Standardized Testing (Baltimore, MD: Johns Hopkins University Press, forthcoming).

. R. Blundell, D. A. Green, and W. Jin, “Big historical increase in numbers did not reduce graduates’ relative wages,” Institute for Fiscal Studies, 18 August 2016, , accessed 23 November 2018.

. D. Willetts, A University Education (Oxford: Oxford University Press, 2017).

. R. Sylvester, “Schools are cheating with their GCSE results,” The Times (London), 21 August 2018, , accessed 23 August 2018.

. “Ambulance service ‘lied over response rates,’” The Telegraph (London), 28 February 2003, , accessed 6 October 2018.

. , accessed 6 October 2018.

. .

. J. M. Keynes, The General Theory of Employment, Interest and Money (New York: Harcourt, Brace, 1936).

. BBC, 1 February 2011, , accessed 18 August 2018.

. Direct Line Group, 2014, , accessed 11 April 2014.

. A. Reurink, “Financial fraud: A literature review,” MPIfG Discussion Paper 16/5 (Cologne: Max Planck Institute for the Study of Societies, 2016).

. R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’15, Sydney, Australia, 10-13 August 2015, pp. 1721-30.

. Board of Governors of the Federal Reserve System, Report to the Congress on Credit Scoring and Its Effects on the Availability and Affordability of Credit, August 2007, , accessed 18 August 2018.

. E. Wall, “How car insurance costs have changed,” The Telegraph (London), 21 January 2013, , accessed 19 August 2018.

Chapter 6. Deliberately Darkened Data

. V. Van Vlasselaer, T. Eliassi-Rad, L. Akoglu, M. Snoeck, and B. Baesens, “Gotcha! Network-based fraud detection for social security fraud,” Management Science 63 (14 July 2016): 3090-3110.

. B. Baesens, V. Van Vlasselaer, and W. Verbeke, Fraud Analytics: Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection (Hoboken, NJ: Wiley, 2015), 19.

. “Crime in England and Wales: Year Ending June 2017,” , accessed 31 December 2017.

. D. J. Hand and G. Blunt, “Estimating the iceberg: How much fraud is there in the UK?” Journal of Financial Transformation 25, part 1 (2009): 19-29, .

. “Rates of fraud, identity theft and scams across the 50 states: FTC data,” Journalist’s Resource, 4 March 2015, , accessed 19 August 2018.

. B. Whitaker, “Never too young to have your identity stolen,” The New York Times, 27 July 2007, , accessed 3 February 2018.

. Javelin, 1 February 2017, , accessed 3 February 2018.

. III, “Facts + Statistics: Identity theft and cybercrime,” 2016, , accessed 3 February 2018.

. DataShield, 14 March 2013, , accessed 3 February 2018.

. A. Reurink; Chapter 5, Note 12.

. , accessed 30 September 2018.

. “Accounting scandals: The dozy watchdogs,” Economist, 11 December 2014, , accessed 7 April 2018.

. E. Greenwood, Playing Dead: A Journey through the World of Death Fraud (New York: Simon and Schuster, 2017).

. CBS This Morning, “Playing a risky game: People who fake death for big money,” , accessed 6 April 2018.

. M. Evans, “British woman who ‘faked death in Zanzibar in £140k insurance fraud bid’ arrested along with teenage son,” The Telegraph (London), 15 February 2017, , accessed 6 April 2018.

. S. Hickey, “Insurance cheats discover social media is the real pain in the neck,” The Guardian (London), 18 July 2016, , accessed 4 April 2018.

. P. Kerr, “‘Ghost Riders’ are target of an insurance sting,” The New York Times, 18 August 1993, , accessed 6 April 2018.

. FBI (N.A.), “Insurance Fraud,” , accessed 6 April 2018.

. E. Crooks, “More than 100 jailed for fake BP oil spill claims,” Financial Times (London), 15 January 2017, , accessed 6 April 2018.

. ABI, “The con’s not on — Insurers thwart 2,400 fraudulent insurance claims valued at £25 million every week,” Association of British Insurers, 7 July 2017, , accessed 4 April 2018.

. “PwC Global Economic Crime Survey: 2016; Adjusting the lens on economic crime,” 18 February 2016, , accessed 8 April 2018.

Chapter 7. Science and Dark Data

. J. M. Masson, ed., The Complete Letters of Sigmund Freud to Wilhelm Fliess (Cambridge, MA: Belknap Press, 1985), 398.

. “Frontal lobotomy,” Journal of the American Medical Association 117 (16 August 1941): 534-35.

. N. Wiener, Cybernetics (Cambridge, MA: MIT Press, 1948).

. J. B. Moseley et al., “A controlled trial of arthroscopic surgery for osteoarthritis of the knee,” New England Journal of Medicine 347, no. 2 (2002): 81-88.

. J. Kim et al., “Association of multivitamin and mineral supplementation and risk of cardiovascular disease: A systematic review and meta-analysis,” Circulation: Cardiovascular Quality and Outcomes 11 (July 2018), , accessed 14 July 2018.

. J. Byrne, MD, “Medical practices not supported by science,” Skeptical Medicine, , accessed 14 July 2018.

. T. Kuhn, The Structure of Scientific Revolutions, 2d ed. (Chicago: University of Chicago Press, 1970), 52.

. J.P.A. Ioannidis, “Why most published research findings are false,” PLOS Medicine 2, no. 8 (2005): 696-701.

. L. Osherovich, “Hedging against academic risk,” Science-Business eXchange, 14 April 2011, , accessed 12 July 2018.

. M. Baker, “1,500 scientists lift the lid on reproducibility,” Nature 533 (July 2016): 452-54, , accessed 12 July 2018.

. C. G. Begley and L. M. Ellis, “Raise standards for preclinical cancer research,” Nature-Comment 483 (March 2012): 531-33.

. L. P. Freedman, I. M. Cockburn, and T. S. Simcoe, “The economics of reproducibility in preclinical research,” PLOS Biology, 9 June 2015, , accessed 12 July 2018.

. B. Nosek et al., “Estimating the reproducibility of psychological science,” Science 349, no. 6251 (August 2015): 943-52.

. .

. .

. F. C. Fang, R. G. Steen, and A. Casadevall, “Misconduct accounts for the majority of retracted scientific publications,” PNAS 109 (October 2012): 17028-33.

. D. G. Smith, J. Clemens, W. Crede, M. Harvey, and E. J. Gracely, “Impact of multiple comparisons in randomized clinical trials,” American Journal of Medicine 83 (September 1987): 545-50.

. C. M. Bennett, A. A. Baird, M. B. Miller, and G. L. Wolford, “Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon: An argument for proper multiple comparisons correction,” Journal of Serendipitous and Unexpected Results 1, no. 1 (2009): 1-5, , accessed 16 August 2018.

. S. Della Sala and R. Cubelli, “Alleged ‘sonic attack’ supported by poor neuropsychology,” Cortex 103 (2018): 387-88.

. R. L. Swanson et al., “Neurological manifestations among U.S. Government personnel reporting directional audible and sensory phenomena in Havana, Cuba,” JAMA 319 (20 March 2018): 1125-33.

. F. Miele, Intelligence, Race, and Genetics: Conversations with Arthur R. Jensen (Oxford: Westview Press, 2002), 99-103.

. C. Babbage, Reflections on the Decline of Science in England, and on Some of Its Causes (London: B. Fellowes, 1830).

. A. D. Sokal, “Transgressing the boundaries: Toward a transformative hermeneutics of quantum gravity,” Social Text 46/47 (Spring/Summer 1996): 217-52.

. , accessed 23 January 2019.

. A. Sokal and J. Bricmont, Intellectual Imposters: Postmodern Philosophers’ Abuse of Science (London: Profile Books, 1998).

. .

. .

. .

. C. Dawson and A. Smith Woodward, “On a bone implement from Piltdown (Sussex),” Geological Magazine Decade 6, no. 2 (1915): 1-5, , accessed 7 July 2018.

. M. Russell, Piltdown Man: The Secret Life of Charles Dawson (Stroud, UK: Tempus, 2003); M. Russell, The Piltdown Man Hoax: Case Closed (Stroud, UK: The History Press, 2012).

. J. Scott, “At UC San Diego: Unraveling a research fraud case,” Los Angeles Times, 30 April 1987, , accessed 4 July 2018.

. B. Grant, “Peer-review fraud scheme uncovered in China,” Scientist, 31 July 2017, , accessed 4 July 2018.

. , accessed 14 October 2018.

. R. A. Millikan, “On the elementary electric charge and the Avogadro constant,” Physical Review 2, no. 2 (August 1913): 109-43.

. W. Broad and N. Wade, Betrayers of the Truth: Fraud and Deceit in the Halls of Science (New York: Touchstone, 1982).

. D. Goodstein, “In defense of Robert Andrews Millikan,” American Scientist 89, no. 1 (January-February 2001): 54-60.

. R. G. Steen, A. Casadevall, and F. C. Fang, “Why has the number of scientific retractions increased?” PLOS ONE 8, no. 7 (8 July 2013), , accessed 9 July 2018.

. D. J. Hand, “Deception and dishonesty with data: Fraud in science,” Significance 4, no. 1 (2007): 22-25; D. J. Hand, Information Generation: How Data Rule Our World (London: Oneworld Publications, 2007); H. F. Judson, The Great Betrayal: Fraud in Science (Orlando, FL: Harcourt, 2004).

. D. J. Hand, “Who told you that?: Data provenance, false facts, and separating the liars from the truth-tellers,” Significance (August 2018): 8-9.

. LGTC (2015), , accessed 17 April 2018.

. Tameside, , accessed 17 April 2018.

Chapter 8. Working with Dark Data

. See, for example, D. Rubin, “Inference and missing data,” Biometrika 63, no. 3 (December 1976): 581-92.

. C. Marsh, Exploring Data (Cambridge: Cambridge University Press, 1988).

. X.-L. Meng, “Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 U.S. presidential election,” Annals of Applied Statistics 12 (June 2018): 685-726.

. R.J.A. Little, “A test of missing completely at random for multivariate data with missing values,” Journal of the American Statistical Association 83, no. 404 (December 1988): 1198-1202.

. E. L. Kaplan and P. Meier, “Nonparametric estimation from incomplete observations,” Journal of the American Statistical Association 53, no. 282 (June 1958): 457-81.

. G. Dvorsky, “What are the most cited research papers of all time?” 30 October 2014, , accessed 22 April 2018.

. F. J. Molnar, B. Hutton, and D. Fergusson, “Does analysis using ‘last observation carried forward’ introduce bias in dementia research?” Canadian Medical Association Journal 179, no. 8 (October 2008): 751-53.

. J. M. Lachin, “Fallacies of last observation carried forward,” Clinical Trials 13, no. 2 (April 2016): 161-68.

. A. Karahalios, L. Baglietto, J. B. Carlin, D. R. English, and J. A. Simpson, “A review of the reporting and handling of missing data in cohort studies with repeated assessment of exposure measures,” BMC Medical Research Methodology 12 (11 July 2012): 96, .

. S.J.W. Shoop, “Should we ban the use of ‘last observation carried forward’ analysis in epidemiological studies?” SM Journal of Public Health and Epidemiology 1, no. 1 (June 2015): 1004.

. S. J. Miller, ed., Benford’s Law: Theory and Applications (Princeton, NJ: Princeton University Press, 2015).

Chapter 9. Useful Dark Data

. S. Newcomb, “Measures of the velocity of light made under the direction of the Secretary of the Navy during the years 1880-1882,” Astronomical Papers 2 (1891): 107-230 (Washington, DC: U.S. Nautical Almanac Office).

. ADRN, .

. D. Barth-Jones, “The ‘reidentification’ of Governor William Weld’s medical information: A critical re-examination of health data identification risks and privacy protections, then and now,” 3 September 2015, , accessed 24 June 2018.

. A. Narayanan and V. Shmatikov, “How to break the anonymity of the Netflix Prize dataset,” 22 November 2007, , accessed 25 March 2018; A. Narayanan and V. Shmatikov, “Robust deanonymization of large sparse datasets (how to break the anonymity of the Netflix Prize dataset),” 5 February 2008, , accessed 24 June 2018.

. D. Hugh-Jones, “Honesty and beliefs about honesty in 15 countries,” 29 October 2015, , accessed 26 June 2018.

. C. Gentry, “Computing arbitrary functions of encrypted data,” Communications of the ACM 53, no. 3 (March 2010): 97-105.

Chapter 10. Classifying Dark Data

. , accessed 27 October 2018.

. A. Cavallo, “Online and official price indexes: Measuring Argentina’s inflation,” Journal of Monetary Economics 60, no. 2 (2013): 152-65.

. A. Cavallo and R. Rigobon, “The billion prices project: Using online prices for measurement and research,” Journal of Economic Perspectives 30, no. 2 (Spring 2016): 151-78.

. C. Szegedy et al., “Intriguing properties of neural networks,” , 19 February 2014, accessed 23 August 2018.

. M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter, “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition,” October 2016, , accessed 23 August 2018.
