Data Overload: Data, Journalism, & COVID-19

Posted in Home Furnishings, Local journalism, Uncategorized on November 5th, 2020

So again, welcome to the webinar. My name’s Todd Wallack. I’m spending the year as a Berkman Klein Nieman fellow at Harvard, and I’ve been a data journalist at The Boston Globe for about seven years, as well as an investigative reporter working both with The Boston Globe’s Spotlight Team and the rest of the newsroom. Caroline Chen covers health care for ProPublica, and she previously was a reporter at Bloomberg News. Armand Emamdjomeh is a graphics assignment editor for The Washington Post, and he was previously a deputy director of data visualization at The Los Angeles Times.

And I was excited to have this group of people talk about the issues that reporters face dealing with data because we all have somewhat different expertise. I’m sort of a generalist looking at data, trying to mine it for all sorts of types of stories. Caroline is more of a specialist in health care and will have more expertise in health care data. And Armand has lots of experience in visualizing data.

And it seems like there’s been tons of interest and challenges in looking at COVID-19 data. People have been trying to track it by date and time to see trends and whether it’s getting worse or better, as well as geographically. But there also have been a lot of challenges, such as obstacles trying to obtain the data. So you’ve seen a lot of headlines about that, particularly at a local level, or getting more details on the data, such as on race or age or other details about people affected. And questions about the accuracy and reliability of the data. ProPublica and The New York Times and others have written a lot of stories raising questions about the accuracy and the challenges comparing one country to another because of differences in who’s tested, how accurate the tests are, how accurate death counts are, and other issues.

So I’m going to start by asking a number of questions, and at about 12:35, will switch to questions from the audience. So feel free to start tossing in questions as we speak.
And about halfway through, we’ll start going to audience questions and finish at 1:00. OK. I want to start off just by asking Armand and Caroline, what data are you finding readers most interested in?

Could I jump in now? I think early on the questions were just where is the disease spreading, right? So I think certainly, especially in the US, as the virus first started to hit, everybody just wanted to know case counts. And then I think that started to soon overlay with a concern about deaths. And so I think that continues to be of interest, cases and deaths. And then I would say where are the tests and testing abilities, testing capacities. And I think now there’s an understanding that there are two types of tests. There’s the diagnostic tests. Those are the PCR, the swab tests. And now the new incoming antibody tests. And I think the more sophisticated readers are starting to gain an understanding of what the numbers mean as we’re starting, just like right now, this week, to see studies come out with some numbers around these antibody studies. And there are already furious debates around those study results and whether or not those are meaningful. So I see those as layering, right? We continue to want to care about case counts. And we continue to want to care a lot about deaths and, you know, segmentation of those. So demographics and race and who is being affected. And these are layering as we go.

And Armand, what have you seen?

Yeah, I completely agree, obviously along the same lines of that structure. It has been case counts and where there were reported cases and reported outbreaks, you know, deaths. And now, I think the one thing I can add to what Caroline said is there’s been interest in the trends that are being reported by states as well. So we’ve taken steps to show what this data looks like over time, of course noting the caveats in the data and how it’s being reported and recorded by the states.
Oh, and I forgot to mention, of course, there’s the whole conversation around supplies, PPE, ventilators. It’s obviously been very hard. That’s always a moving target depending on whether you’re talking a local level, national level. You can never pin down how much PPE there is at a given hospital or state, but there is always interest in that question of supplies.

And I’m also curious about how easy or difficult has it been to obtain all the data for your stories and graphics?

I can talk about that. I think, you know, the data at a national level is basically nonexistent. Most things are reported at the state level, so that means you have to either rely on an aggregator or aggregate the data yourself, you know, going to all these different state sites, figuring out where they report it, how they report it, in what format. Also noting that this data, what the states are reporting, also changes over time, as do the platforms they’re using to report it. I sent out a tweet a few days ago that was like, what if you were reporting a live election, but you were building your rig for reporting the results as the election was happening? And what everywhere was reporting as the results was also changing, and they were changing how they report it as well. So it has been extraordinarily difficult in that sense, to build things that don’t constantly break, and build data flows that are actually kind of stable, given the fact that what is being reported is moving under them as well.

Yeah, I would say that there are certain things that, just by the nature of the pandemic, are going to be constantly changing. So for example, testing capacity. I’ve done a lot of reporting around testing and testing capacity. And just by the nature of what’s happening, that is changing constantly. So whether that’s nationally, whether that’s locally, if you’re trying to say what is the testing capacity of my state, that number is going to be constantly changing.
And it should be, right? Because we have been constantly ramping up testing capacity. So for any reporter to try to get a handle on that, on trying to inform their local readers, they’d have to constantly update that.

Is it possible to actually get an accurate number at any point in time?

I think that is technically possible, but your number’s going to be outdated like within an hour, even at a specific lab. So I have, at certain points, been able to be like, I nailed the number. It’s already outdated. Is there any point in even doing that? Yeah, I think there is. It’s a worthy exercise to try to get a ballpark and to show trends for readers. So there have been times where I’ve tried to do that for specific stories, but it is a frustrating exercise, and I’ve really encouraged other reporters to really try to explain to readers where you got this number, what has gone into this number, and how long of a shelf life the number will have, and really try to show your work to your readers more than I normally would. So I think some of that is inherent. There are other things, though, where your report is only as good as where you get it from. So for example, the WHO puts out daily situation reports. That is the only way you can really get a source for international case counts, right? But the WHO’s information is only as good as the countries from which it comes. So I keep repeatedly explaining that the WHO has a recommended way for what they count as a positive case. And they say that it is if you test positive with a PCR-based test. For the longest time, China just decided that they were only going to count as positive someone who had a positive PCR test and symptoms. They were not counting people who had a positive PCR test but no symptoms. So they weren’t counting asymptomatic cases. There’s nothing the WHO can do about that. There’s nothing anybody can do about that.
And then after a while, they changed that. So you needed to know that about China. And I mean, that’s deeply frustrating. You can’t get everybody to report the same way. And you need to have those caveats in your reporting.

And this also trickles down to like 50 states, or 56 states and territories, all doing it in their own way as well.

Right. That’s got to make comparisons really tricky, when everyone has a different way of reporting the data, tracking the data. There are different rules on who gets tested and what gets counted.

It sure does. And again, I think you can only be clear about the caveats, that this is entirely dependent on what’s being reported and how it’s being reported.

Yeah, I’ve been very, very cautious about comparisons.

Got it. And are there any other problems you’ve noticed in the data that people should be aware of?

I have been very careful, or I’ve been encouraging reporters in my newsroom and trying to explain to the public, to know exactly what the definitions are of numbers that get thrown around. So one thing, for example, I’ve been trying to explain a lot to lay readers is what actually is the fatality rate, right? And there’s a big difference, I think, between what the public wants to know, which is, you know, if I get infected will I die, and what is reported as the case fatality rate, right? So the case fatality rate is the number of reported deaths divided by the number of lab-confirmed infections. So everybody knows in the US, it had been really hard, and it continues to be really hard in many places, to get tested in the first place. And a lot of places are not testing unless you’re really, really sick. So that denominator is going to be much smaller than the actual number of infections. So very early on in the United States, the case fatality rate was something like 10%. Because we just weren’t testing a lot of people. And you can think about it as an iceberg model.
Like, the deaths are usually the easiest to find and count, especially early on in a pandemic. This always happens in a pandemic. And the people who are asymptomatically infected are the hardest to find in the first place. But again, the average lay reader, they just want to know if I get infected, will I die? And they’re looking at that number that’s been reported in your headline, and they’re just looking at that and being like, if I get infected, that’s my chance of dying. And we have such a huge responsibility as reporters to explain that number and not just throw things around in headlines. So I think there are a lot of numbers like this. As a science and health reporter, I feel like we have a lot of responsibility to explain to people what these numbers are and are not. These are the rate of infection, the probability you have of infecting other people, the average number of people you’ll infect. It’s a process of understanding, and this is what I’m trying to get across to my readers. There is so much we are still learning that we don’t know yet, and we cannot present this as set in stone.

And building on that, mentioning cases being kind of a difficult number to divide against. Like, the deaths number also is slippery. We’ve seen stories highlighting this in recent days, and it’s been something we’ve been kind of saying for a while. It’s like not every death is being accurately categorized either. Recently New York City added some, what was it? Like 3,700? I forget the exact number of deaths that were classified as probably COVID-19. And you know, if it’s happening in New York City, probably there is some fraction of cases that’s being miscategorized elsewhere or never even recorded. So that number is slippery as well.
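
The gap between the case fatality rate and the question readers are really asking can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the undercount factor, and every number in it are hypothetical, chosen to show how a small denominator of lab-confirmed cases inflates the rate. None of it is real surveillance data.

```python
def case_fatality_rate(reported_deaths: int, confirmed_cases: int) -> float:
    """Reported deaths divided by lab-confirmed infections."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return reported_deaths / confirmed_cases


def rate_if_undercounted(reported_deaths: int, confirmed_cases: int,
                         undercount_factor: float) -> float:
    """Same deaths, but assume true infections are
    undercount_factor times the confirmed cases (an assumption)."""
    if undercount_factor <= 0:
        raise ValueError("undercount_factor must be positive")
    return reported_deaths / (confirmed_cases * undercount_factor)


# Hypothetical numbers, for illustration only.
deaths, confirmed = 100, 1_000
cfr = case_fatality_rate(deaths, confirmed)
adjusted = rate_if_undercounted(deaths, confirmed, undercount_factor=10)

print(f"Case fatality rate: {cfr:.1%}")
print(f"Rate if infections were 10x confirmed cases: {adjusted:.1%}")
```

The same deaths give a very different rate once untested infections are counted in the denominator. As noted above, the deaths number is uncertain too, so neither figure should be presented as a person's chance of dying.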
And I think when talking about fatality rates and that kind of thing, rather than just talking about one big number, we’ve been trying, when we have the data available, to at least break it down a little bit better into segments of the population or report the comorbidities that studies have been reporting. So it’s not like a flat 3.2% or whatever it would be. It depends on a lot of factors that are related to the individual.

It definitely sounds challenging when there are questions about and uncertainty about both the numerator and the denominator when you’re trying to calculate rates. My sense is that these are problems that data reporters, and journalists in general, encounter when trying to get data. It’s often hard to get one clean database at a national level or global level, where we’re often aggregating it from lots of different places. And each place might have different ways of counting the numbers and reporting the numbers. And the data can be messy. Is there anything different that you’re finding in dealing with COVID-19 data? Or does it reflect challenges you’ve faced doing other types of stories?

This is more theoretical. But you know, these numbers are being reported by states and by countries and everywhere very precisely, but by its nature, it’s a very imprecise count. So there is this weird situation. You know, inaccurate but precise is one type of data classification. And I think that’s where we are now. It’s like you’re taking shots at a dartboard, and they’re all landing in a very precise, same place, but you’re off somewhere. You’re not actually hitting the dartboard. It’s like somewhere off on the wall because you’re throwing the darts kind of blindfolded. But we have very precise counts.

Yeah.
One thing I’ve seen, and I know this is a really hard thing to do, especially if you have an editor that’s pushing you, is to resist the urge to write. Because what I do see is that health agencies, as they release data, are refining as they go. And I think this is because they are also figuring out what they need to release. So for example, to give a very specific example, New York City started out by only reporting test results by borough. And then a lot of people were like, well, that’s not enough information. And they were getting a lot of criticism. And then they started releasing, it wasn’t quite by neighborhood. It was by this very strange not quite zip code, not quite neighborhood. It was percentage of positive, but they didn’t have any raw counts. There was no numerator, no denominator. It was percentage. And I was like, well, I can’t do anything with that. Because if you say that in this area, it was 66% positive, that could mean that you only did three tests there, and two people tested positive. That’s meaningless. But I did see some news organizations write a story on that. And I was like, that’s a bit dangerous. And then, I think like within a week, they then rereleased numbers, which were by zip code and had numerator and denominator. They had way more information. And then you could write a more meaningful story. And then New York City has continued to update and iterate and give more and more granular information. So I do think that there is a benefit to kind of waiting. Because I’ve seen, more than I’ve ever seen before in any other outbreak I’ve covered, health departments iterate as they go with the data that they’re releasing. And I actually think, because this is happening across the country, reporters, I think, are able to push health departments and be able to say, hey, you know, Ohio released this information. Florida, why aren’t you releasing that information?
And be able to sort of play departments off of each other. And I think in a similar theme, I think it’s really sometimes dangerous to write a story off of a preprint. I do think it’s really great that scientists, researchers are moving quickly and sharing information on medRxiv and bioRxiv and not waiting to go through that whole process. But then it’s not peer reviewed, right? So this puts you in a really dangerous position as a reporter to have to write a story off of a non-peer-reviewed study. So I think one of my goals is to never let a preprint walk alone, as in you don’t write a story on a preprint by itself. You try to let it go in concert with other studies and look for a trend. Or at least let lots and lots of people comment on it, and don’t just write a story on this. So this is happening right now with all these antibody studies, right? Like, Stanford put out its preprint on its antibody serosurvey. And there were a lot of stories that got written really quickly. And then in the next day, there has been the critique wave of like, was it a good sample? Was it biased? You know, all of that stuff. And I just wish that a lot of reporters might have waited a little bit. And now there is the Los Angeles serosurvey. And I think you could have maybe waited and gathered a bunch of these studies and maybe done one careful story in one go, or at least gotten a lot more outside voices than you normally would before writing that one story. Because they aren’t peer reviewed. So you do have to treat preprints differently.

Right. And interestingly, of course, none of our articles are peer reviewed. So I’m curious what process you go through to make sure that your own interpretations and analysis are sound before publishing.

I just run preprints by way, way more people than I normally would. If something’s already published in a journal, I know that it’s gone through that peer review process.
If it hasn’t been, I will run it by a lot more outside experts than I normally would and just go that extra mile and really ask myself, do I have to write this now? Can I wait for it to go through that peer review process? And you can ask the author. Sometimes they’ll say, oh yeah, this has already been accepted by JAMA or The Lancet or whatever. And that gives me an extra measure of confidence. If that’s the case, that’s helpful to know. And if not and it’s like, this is such an important study that I need to write about it right now, then I get all those outside voices. I try to get multiple independent outside voices that are from a number of different institutions, get all their comments. And if all of them are really, really negative, then again, I have to ask, why am I writing about this study in the first place? The bar just gets so much higher if it’s not in a journal and hasn’t gone through the peer review process.

And I assume, Caroline, even when ProPublica or The Post or others are doing their own analysis, we do the same thing. We go to outside experts and say, here are the numbers I’m finding. Does my methodology make sense? Is there a good explanation for these results? Rather than just posting something on Twitter or throwing it on our website, we usually talk to experts first.

Yeah, exactly. And you know, there’s a bit of self analysis in here too. Like looking at what we call data smells. Does what’s in the data challenge your basic assumptions? Does it show an opposite trend to what you’re expecting? Are there big gaps or negative values where there shouldn’t be? It’s kind of like sanity checking the data as well.

And similarly, I know there are questions about different models that organizations are using. A lot of people are looking at the University of Washington model. It has a website that’s very easy to use, predicting when peaks are going to be for hospitalizations and other issues.
But there are lots of other models it seems, and there are questions about what variables go into each model, how the numbers are calculated. And they can produce conflicting results. So that has to be challenging to deal with.

Yeah, so I did a whole column on forecasting and predictions earlier on, which was partly for reporters and partly for the public. And I think, again, the question really is, who is your audience? And who are you writing for? And I try to keep that at the back of my mind. Because I think there’s a difference here. If you are writing for really a lay public, again, you have to remind them, this is an estimate. And for that particular column, I was talking to an epidemiologist. And I said, you know, I was reading this sentence that somebody had written about their particular model. And I said, it seemed awfully specific, where they said that this means that in New York, this was back in early March, that last week there were, it was something like 1,583 to 2,000, blah blah blah. It was like down to the digit number of people infected. And I was like, I read that sentence and I feel like it gives a lay public this sense that you can be that precise and calculate down to a single digit how many people are infected. And for me, as a writer, I would never use that level of precision. Because it signals something to a reader. I would round and use the word around. And I said, what does this say to you as an epidemiologist? And it was really interesting because she said, I like seeing that sort of precision. Because from one epidemiologist to another, I can then go and redo his model and make sure that our numbers match exactly. So it’s very useful from one researcher to another. But I agree with you, for a lay audience, that’s not the message we want to send. Because I said, what is the takeaway you would want for a lay audience? She said the takeaway I would want a lay audience to hear is it’s not 400 and it’s not a million.
You’re in the low thousands. So really, that’s kind of the question that I always ask when I’m talking to someone who’s doing modeling. I say, what is the takeaway you would want for a lay audience? And really, she said with models, you need to be thinking in orders of magnitude. And I think that our responsibility as reporters is to then say, OK, so I’m going to give an order-of-magnitude type of number to my readers.

Got it. And I’m also curious, are there any mistakes that you see a lot of people repeatedly making that bug you? One I see all the time is people say, oh, there are four million people tested, when there have been four million tests. But some of the tests require multiple samples. People could have been tested multiple times. So there are different numbers. I also see people saying, oh, there are this many cases when it’s the number of confirmed cases. And there are other studies showing there are probably many times more people who’ve been infected but haven’t been tested.

Yeah, the one that you just mentioned, Todd, I think is the one that I’ve seen most often just in talking with people and hearing that like, oh, this county has only had five cases. It’s like, well, no. I mean, yes, but no. That is just what’s being reported by the states. And again, that comes back to what Caroline was just saying about this precision, implying that we know there are 526 cases in this county in Illinois or something. But maybe that’s on us too. I know the instinct is to try and report the data to the granularity we have available. But maybe there are better ways that we can report the data that suggest more of this imprecision about the data. And that’s something I think we can ask ourselves and address as we try to put together these pages that are tracking the spread of the disease or whatever.
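
One way to act on the point about false precision is to round a figure before it reaches a lay audience. The helper below is a minimal sketch, not a newsroom standard: the function names `round_to_sig_figs` and `describe_for_readers` are hypothetical, and how many significant figures to keep is an editorial judgment.

```python
import math


def round_to_sig_figs(value: float, sig_figs: int = 1) -> int:
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0
    # Position of the leading digit, e.g. 3 for 1583.
    magnitude = math.floor(math.log10(abs(value)))
    factor = 10 ** (magnitude - sig_figs + 1)
    return int(round(value / factor) * factor)


def describe_for_readers(value: float, sig_figs: int = 1) -> str:
    """Phrase a rounded estimate with the word 'around'."""
    return f"around {round_to_sig_figs(value, sig_figs):,}"


# Figures mentioned in the discussion, used here only as inputs.
print(describe_for_readers(1583))      # one significant figure
print(describe_for_readers(526, 2))    # two significant figures
```

Keeping one significant figure turns 1,583 into "around 2,000", which signals the order of magnitude without implying a person-by-person count.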
Another one is just people being exposed to types of scales and visuals that they’re not used to seeing. So like, we’re seeing a lot more logarithmic scales than we’re used to. And they don’t show things, you know, growth doesn’t look the same way on a log scale as on a linear scale. But if you’re looking at it and think you’re on a linear scale, then you might think things are declining or dropping off when actually, that is very much not the case.

Yeah, I think, Todd, you picked up on my biggest pet peeve, which is people not paying attention to units, right? And this has been my soapbox rant for the longest time. It’s like please, try to get your units in people. Because I think, again, that is what readers care about, right? They see a million tests, and they think that that is a million people. When you say, we’re rolling out a million tests, they will automatically think that is a million people who can get tested. And depending on the type of test, this is really confusing. Like the CDC test, you had to divide by two. The Abbott test, the rapid test, it is one test per person. So depending on which test you are doing, it is a different equation. And it really is a reporter’s responsibility to figure out what the heck is being said. And it is a way, frankly, for officials to inflate numbers. And the only way to really get an apples to apples comparison is if we get testing capacity in people. And so I think that’s a journalist’s responsibility, to always get the units in people. That way, we can compare state by state, country by country. So I do think that that is a mistake, well, a mistake, or I see a confusion that annoys me when I see that. And yeah, I think just not explaining that everything should be like, this is a reported number of deaths or reported number of cases at this point in time, as Armand said. I think those are really common.
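
The advice to "get your units in people" amounts to a small conversion step. The sketch below assumes the two ratios mentioned in the discussion (two tests per person for the CDC test, one for the Abbott rapid test). The dictionary keys and the function name `people_testable` are hypothetical, and any real ratio should be confirmed with the lab or agency supplying the number.

```python
# Tests consumed per person, as described in the discussion.
# The keys are illustrative labels, not official product names.
TESTS_PER_PERSON = {
    "cdc": 2,           # "you had to divide by two"
    "abbott_rapid": 1,  # "one test per person"
}


def people_testable(test_count: int, test_type: str) -> int:
    """Convert a count of tests into the number of people it can cover."""
    if test_type not in TESTS_PER_PERSON:
        raise KeyError(f"unknown test type: {test_type!r}")
    return test_count // TESTS_PER_PERSON[test_type]


# A hypothetical announcement of one million tests of each type.
for kind in TESTS_PER_PERSON:
    people = people_testable(1_000_000, kind)
    print(f"{kind}: 1,000,000 tests covers about {people:,} people")
```

Expressing every capacity figure in people is what makes a state-by-state or country-by-country comparison apples to apples.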
I think also, this is more philosophical, is just presuming we know things. I mostly see this frankly on Twitter and on TV, but just this air of we know what to do. Like, if this state just did this, then we would solve the crisis. No. Nobody knows what to do. We have only known this, like humanity has only known this virus since January. Well, I mean, in China it was a little bit earlier than that. But in the US, we haven’t really known it that long. And every time I dig into this, whether it is on really understanding how it is transmitted, or I recently was doing a lot of reporting on doctors struggling to understand how best to use ventilators, how to best treat critically ill patients. Everybody is struggling to do their best by patients and to really understand what to do. And so I think there are no easy answers in this crisis. And I think this is a failure of communication both by our officials and also, frankly, by journalists, when we make it sound like there is an obvious or easy answer, and fail to acknowledge that, to a certain degree, we are all still learning. And so that annoys me, whenever it comes across as like, well, obviously.

That sounds good. Why don’t we go to questions from the audience, which are starting to pile up. One that’s been upvoted the most is from a Berkman fellow, BaoBao Zhang, who wondered how you feel about nonexperts and nonjournalists weighing in with their own analysis on Medium or Twitter or elsewhere. And not all of those people do what journalists do, which is going to experts first to vet their conclusions.

I definitely feel like, you know, it’s a free society. And that’s what platforms like Medium exist for. So you’re specifically citing Tomas Pueyo’s The Hammer and the Dance. I think it’s fine if people want to publish, and I think that they definitely find their own audiences. I do think that things like that sometimes are, I think they find their own audiences, basically.
I had a lot of people actually send me that specific post and be like, I cannot understand this. Can you write a version that handholds a little bit more? Because I think the part where oftentimes, professionals who are experts in their field, whether they’re data scientists or, I see this a lot, where like, a clinician or somebody will be writing. They tend to use a lot of jargon and don’t break it down to the degree that I tend to try to do. And some people do it. Some people are fantastic communicators naturally. But I think that’s a tendency. I tend to see a lot of jargon. And so I think there is a place for them, and then I think sometimes, the drawback is that I think they’re not trained to be able to use the language that helps them reach as many people as they can and to give as much context as, I think, a reporter would know how to do. That’s my off the top of my head answer.

That’s good. There’s also a question about how do we deal with issues where we publish an article based on data, and then the data changes, or the information changes. This probably comes up all the time with health care studies, Caroline, where a new study comes out that contradicts a past study. Or a study’s been retracted. So how do we deal with this when people are still passing around the old article or chart based on old, outdated data and information?

Yeah, oh. What a challenge it is right now with developments in the situation. So one thing that I am doing now, even more so than I normally do, is I am aggressively dating my information. In my stories, in my sentences, I’ll say like, as of Wednesday afternoon, according to the Association of Public Health Labs, the US had a testing capacity of a million tests. Because literally by Thursday morning, the number is changing. So I try to tag as much of my information as possible. I’m linking a lot more aggressively than I normally do and also adding the date and time stamp.
So whenever whoever comes along to my articles sees that information, they will know as of when that information was true. So unfortunately, some people are not going to read that carefully. But at least the timestamp will be next to that. So I cannot go back and update my articles constantly. But at least the information that somebody reads will have a timestamp next to it. So I think that’s probably the best thing you can do. And then yes, update as you go. And I mean, again, this is where the language that you use at the time you write also helps you. Because I also use language like, at this time, scientists understand this to be x. So when I was working on a column about asymptomatic transmission, there was a lot of language I had in there which was like, as of now, scientists understand that whatever. So again, there’s a date at the top of my article. I’m using a lot of language that suggests I’m giving you the best understanding at this time. And then I’m also linking to studies and putting language in that’s like, as of this interview that I did on this date, this is what I was told. So I think all of that in combination, hopefully, even if a reader comes along later, they will know that that was information that was current at the time that I wrote that article. And I think that’s the best you can do.

Yeah. And from a data standpoint, we can either build our pages and apps to plug into live data that updates, so that you are seeing updated data as of the timestamp at the top of the page or right on the chart or whatever. Again, we try to be transparent about when that data is updating. Or like Caroline says, we can build it statically, with like Illustrator, or just save it as a static SVG and then make it very clear that this is data as of x. Otherwise, we’ve been in situations when we’re trying to publish a story, and we just have to keep updating the charts like five times because the data keeps changing as we’re writing the story.
Yeah, and other more subtle things. So ProPublica normally does really long, sort of deep dive investigations. And actually, our social folks are used to retweeting our stories forever. Because we often are doing such long, retrospective investigations that you could retweet our stories like two years from now, and there’s no reason why somebody couldn’t read them again later. And we’ve completely reconfigured that. So they no longer will retweet a story because they know the information could be totally outdated. So even thinking about that, like your social strategy. They will check in and be like, can I still tweet this story from our main account? Is that information still current? Like, thinking about that kind of thing. And then obviously, if there’s some really major new information. Like for example, if I had written a whole column on asymptomatic transmission, and there’s some really major information that’s really relevant to know, I will put an update at the top of that story. So being selective.

Both good points. Next question is from Eva Wolfangel, who’s a Knight Science Journalism fellow, who asks about the fact that researchers often try to communicate uncertainty, and I think there are two challenges journalists face. One is how to communicate that same uncertainty. And then there’s also the question, do we weaken our own stories and reporting and data when we communicate that uncertainty? Or are people just going to say, oh, it’s an estimate. It has such a wide range. It has a margin of error. You can’t really rely on it. So how do you deal with those challenges?

I mean, I try to convey that in describing the process of science, right?
So just to give a very specific example, in the story I was working on about asymptomatic transmission, there was a part where I talked about how new studies have shown that viral load is actually higher at the start of the disease, the course of disease, for COVID-19, which means that you could be more contagious even before symptoms start. But I went out of my way to explain how this is unexpected because for COVID-19's close cousins, the coronavirus cousins SARS and MERS, you were most contagious, you had the highest viral load, in the middle of the course of disease, when your symptoms were highest. And I think just explaining that, which would be why your natural assumption, the original assumption, was that COVID-19 would behave the same way. It's something that any reader can understand, that naturally, you'd look at historical models, and you'd expect it to behave the same way. And I think trying to explain the process of science helps, and I feel like I just over explain. And I think showing that uncertainty, or even just saying things like, I was working on a story about ventilator use, and I had a clinician give me a number. And then he called me back and was like, you know that number I gave you? I know you hate this, journalists hate this, but it might change. And I was like, no, no, no. That's fine. That's fine, and I appreciate that you wanted to clarify that. So then I just added a line, a very short line, saying he added that it's early days, and more information will be gathered. And I think that's fine and a good indicator to readers that more studies are going to happen. So I definitely feel like there are ways for journalists to indicate that for their readers.
And from a visual standpoint, in terms of how to communicate uncertainty, look to the annual discussion every hurricane season about how to plot the likely path of a hurricane. It's like visuals.
You want to give somebody something to look at that tries to convey the data as best as possible. And I think in the case of this outbreak, the best we can do is work that into the chatter and the headline around the chart, the annotations. Say that it's reported cases or confirmed cases or reported deaths. Try and communicate the uncertainty in what's around the chart, rather than the numbers, which are what's actually being reported and what we actually have to chart.
Got it. And I want to take a question from Saul Tannenbaum, who asks about questions raised by COVID skeptics, who will often, when we report deaths, argue that they're overcounted or, in extreme cases, that there are no COVID deaths, and say, well, they're really dying from a heart attack. Or they're really dying from pneumonia. Or they're dying from some other cause. And yes, they tested positive for COVID, but that wasn't necessarily what caused their death. How do you deal with those types of questions?
That's interesting. I think that I don't know that that's a useful debate right now, right? I think all you can do, because I think you can have that debate at either end. Because then you get into the debates on the people who are dying at home. Did they die of COVID? Did they die of not COVID? How do you then count the people who are the excess impact from COVID because they died at home because they didn't want to go in for help? You know, I think there are so many swirling questions around deaths related to COVID that are going to be so hard to untangle. And I think as reporters, the only thing you can really do is just be really straight and really flat and just be like, here are the number of people who died with a positive COVID test. And just leave it at that. And then here are the number of people who died at home, and here is how it compares to the number of people who died at home last year at the same time. And show that gap if you are able to get that number from your state.
I just don't know that those debates are really helpful or that getting into those weeds and trying to parse that is going to get anywhere at this point. Because you can have that same debate about the flu. Like, so and so had a positive flu test, but did they die of their underlying condition? Their pneumonia came from the flu, but they also had diabetes. What does that mean? I just don't know where you'll get with that.
And it reminds me of after Hurricane Maria, when they started and did studies of what did the excess mortality rate look like in Puerto Rico after Hurricane Maria? I worked on the homicide report at The LA Times for several years. And the LA county coroner, if somebody was shot and then died like, say, 10 years later of complications from that gunshot wound, like eventual health impacts, it's still ruled a homicide because they died because of complications from that gunshot wound. So this is not just solely restricted to a COVID-19 debate. It's just mortality statistics in general.
Right. Another question that came up is, what is the most reliable source for COVID-19 data? I think there are at least a half a dozen sources of aggregated national data and a couple sources with world data. Armand, go ahead.
Yeah, most reliable is the key. I mean, Johns Hopkins has really been putting tons of work into aggregating as much data as possible. You can take a scroll through their issues list on GitHub just to kind of get an idea of the number of requests this has generated. Of course, the World Health Organization, and then I think a number of media organizations, including us, are also trying to aggregate at the US level, like state data and county data. I can't tell you which is the most accurate. I would say just for US case counts, we mostly use Johns Hopkins data. We long ago gave up on the CDC, which is very unfortunate to have to say that.
But they don't update on weekends, and they are like 24 hours behind on their weekday updates. So we use Johns Hopkins for just our daily case counts. In terms of testing capacity, we mostly point to The COVID Tracking Project. International, sometimes, depending on what it is, WHO or JHU. But again, it really depends on exactly what you're trying to get at.
Got it. And of course, sometimes for very local stories, there may be only one possible source of data, coming from a county or a hospital or somewhere.
And actually, going to your local health department directly is probably going to be the most up to date information, which will be even faster and more up to date than going to a site like JHU, frankly. Just wanted to clarify.
There's also a question from [? magna ?] Cheney, who's a Nieman affiliate, asking about the best practice for archiving stories. So Caroline, you mentioned having a date stamp can be one way.
Yeah. The Guardian does this thing where they have a warning up really high, where they say like, warning, this story is like more than a year old. Or they have some sort of very visible warning up high, which I always appreciate whenever I see that. So that could be one way to do it.
OK. Michelle O'Neill asks, what can we report that's meaningful without having the basic data that we want? And I guess that comes up when we want to say, you know, which states? Where are the hot spots? Or how is the US doing versus other countries when there are all these questions that we've brought up about how many people are actually infected given the differences in testing, and how many people have died given differences in counting. And because of all these uncertainties with numbers, it must be really challenging to figure out what we can really say with confidence.
Yeah, agreed.
I think we need to make basic assumptions or adjustments when we can, you know, for denominators that don't exist or for other things that we don't consider reliable. Like on our pages that show the data that we know about cases and deaths throughout the country, instead of normalizing, we're looking at like known cases per population of the state, per population of the county. But again, we have to be clear that this is all just based on what's being reported.
OK. There was also a question about, what do you do when you don't have data or information, or you have conflicting data, so you can't be sure? Do you just avoid writing about it? Or do you write about it as best you can?
It's certainly difficult to make graphs when you don't have data.
Well, I do think that there is value in describing the lack of information, especially when, say, your local state department is withholding information that should be public, right? So I think that there was, early on, a lot of good journalism being done about the need for demographic information, about who was being infected, right? And now we're getting a lot more of that information, which is pointing out big problems in who is being infected. So I would not dismiss that as a possibility for where you start, reporting on just the absence of good information, which can actually spur change and get you the information that you then want.
Great. So I think we just have time for one last question. Gina Pavone notes that there's talk of using an app for contact tracing. There's a project at MIT for that. And there's also been stories based on cell phone data that's been released. And Gina wonders how journalists deal with aspects of privacy, or reporting on the challenges of releasing that data and using that type of sensitive data, and making the topic accessible to a mass audience?
Yeah, I mean, I think this is a hot topic across a lot of different countries, a lot of different localities. And I think one, really understanding the nitty gritty of how it's going to be used is important. I think there are a lot of think pieces about these apps right now that I'm seeing, which ask a lot of philosophical and hypothetical questions. But I see way fewer stories that really get into the guts of how they're going to be used, which would actually help answer some of these think pieces. So I think that would be useful journalism to be done. It's much easier to be like, well, what about privacy? But then you don't actually know what's going to happen. I think the other question though, which was raised with me, was some public health professionals that asked, how is this going to actually intersect with the existing public health infrastructure? Because if there are a bunch of people who have downloaded an app and it's not talking to public health officials and not helping them do their actual job, that's also useless. So this needs to fit into the existing public health ecosystem. So I think there's a lot of good reporting that can be done around that. And then again, this needs to fit into existing testing capacity. There's no point in having a great contact tracing app if you then can't test people and find out who is sick in the first place. So there's a lot of questions about do we have a very shiny looking object that doesn't mesh with the actual world of needs? And I think helping people, your readers, actually understand how this app needs to fit into the actual workflow of containing the virus. All of those things can be helpful to your readers. And then finally, also just like the math of how many people would actually need to download the app for this to be useful in achieving what it needs to do. Because there is a minimum number of people who need to have the app for a contact tracing app to work.
Got it.
Thank you so much. Thanks to the panelists. And for everyone who tuned in, there will be a recording available in a couple days on the Berkman Klein Center event page. And there's also going to be a quick poll survey at the very end. So thanks again to Armand and Caroline, and thanks to everybody who is watching.