December 31, 2020•1,697 words
2020 wasn't great! I personally haven't had anything close to the worst of it. I'm healthy(ish), nourished, and sheltered from the elements. What's more, I was fortunate enough to get enough work, enough resources, and enough support to keep walking the crooked path of freelance journalism.
Indeed, the circumstances of this year gave me the space and time to significantly change my own methods of work, or develop some skills I was already working on. I believe those changes were for the better. I have talked (ad nauseam, perhaps) to those I know well about some of these changes, but I have not explained them systematically to anyone, not even to myself.
You can think about this post as my 2020 version of the "best of" lists that so many of my fellow freelancers posted this year, just as I have done in the past. There are three parts (they all require a little explanation). I am going with this one first because it contains story plugs, and thus hews to the freelancer's December ritual.
I hope that spelling them out might be useful to others, even if it is an example of what not to do. Holler at me on Twitter or email me on email@example.com if you want to talk more about it all.
1. Open Source Intelligence
Do you remember in 2019 when a bunch of journalists got laid off, and 4chan whipped up a trolling campaign telling journalists to "learn to code"? People really should be careful what they wish for.
This year I really worked hard on improving my chops and my methodological rigour when it comes to the use of open source intelligence techniques. I think the stories I have linked to at the end show how these techniques can be helpful.
The concept of Open Source Intelligence comes from the world of national security, where it is necessary to distinguish certain kinds of information gathering from more covert means. In practice, it often involves scouring the Internet for information that persons or organizations of interest have left lying around, intentionally or otherwise.
The concept, and the discipline, has been expanded and adapted by practitioners in law enforcement and cybersecurity. In the latter case, in particular, it has given rise to an ecology of software tools and techniques, many of which are free and open source, for getting information of all kinds from social media, websites, images and video, the so-called "deep web" (which should be distinguished from another source, the "dark web"), and more.
With some simple command line tools, I can download the entire history of a Twitter account, the contents of an Instagram feed, or an email address's history of being caught up in data breaches. There are frameworks and even entire operating systems devoted to or optimised for OSINT. (To get nerdy: I use a lot of tools from the BlackArch repository on my Arch-based Manjaro Linux OS.)
Using some of these tools does involve getting under the hood of your computer somewhat - certainly more than you are generally expected to in most J School courses. One upside of not being able to leave the house is that I have had long stretches of uninterrupted time in the home office to do just that.
For some, OSINT is more of a hobby than a job. Amateur practitioners have made their own contribution by being much more open and collaborative in sharing tips and techniques on a range of blogs, podcasts and more. OSINT has now become somewhat more than a cottage industry: alongside the free stuff there are expensive tools and packages, and it is also now possible to be a certified OSINT investigator.
There are many OSINT "influencers", some of whom, like Michael Bazzell, offer training and very interesting how-to books.
Increasingly, activists also use these techniques very skilfully, including in identifying extremists in their communities.
Given all of this, it is amazing to me that this discipline - at least as a set of tools and techniques assembled into a consistent methodology - has not made more inroads into journalism! It's true that specialist outlets like Bellingcat have offered an example that some media outlets have made moves to follow. And I am aware of many colleagues who make excellent use of these techniques in their reporting, even if they don't call it open source intelligence.
OSINT techniques can be employed at minimal cost, are often spectacularly effective, already have well established methods and tools, and are eminently teachable. Although some schools (Stanford is one) do teach some of this under various guises, it ought to be more thoroughly incorporated into the tradecraft of investigative journalism, starting at the level of journalism education.
So for my boast-list I picked out five stories that I wrote or co-wrote that used tools and techniques derived from the OSINT world to carry out open source investigations:
I used OSINT all along in this story that identified Rinaldo Nazzaro as the founder of The Base. But one crucial reporting sequence would have been impossible without OSINT. A reverse image search on Yandex using public photos of "Norman Spear" yielded a Russian ad for English lessons with a photo that looked like the same guy. I ran a phone number from that ad through a tool that accesses data from third-party caller ID services. In Cyrillic characters, it yielded two names, one of which, "Ron", matched the name I already had from public records reporting for The Base's leader: Rinaldo or "Ron" Nazzaro. It also prompted me to use the other name - his wife's - plus a Cyrillic transliteration of Nazzaro in a Yandex search. This in turn yielded a third-party archive of Nazzaro's wife's since-deleted page on the Russian social media site VKontakte. That page contained photos from family holidays, domestic life, and even photos of the pair's wedding. At that point, I had multiple, reinforcing forms of confirmation of his identity.
Bellingcat With Robert Evans
This was a kind of companion piece to another investigation Robert and I did together which revealed the links between a then-prominent anti-lockdown protest group, American Revolution 2.0, and various far right figures, including their web designer, who was the then-proprietor of mymilitia.com, and the web designer and administrator for a neo-Nazi record label's site. That initial story established the relationship between Chad Embrey and a wide range of sites by analysing documents from mymilitia.com that I obtained by using a site:mymilitia.com filetype:pdf search string on Google, selecting for the maximum number of results per page, and quickly downloading the results with the 'downthemall' browser extension. Embrey's ownership and management of a range of sites was established by analyzing DNS and whois information.
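For readers curious what comparing DNS and whois information can look like in practice, here is a minimal sketch in Python. It assumes the records have already been collected by some other tool and typed into a dictionary; the function name, the field names, and all of the domains and addresses are placeholders made up for illustration, not data from the story and not a description of the tools actually used.

```python
from collections import defaultdict


def group_by_shared_value(records, field):
    """Group domains that share the same value for one field of their records.

    `records` maps a domain name to a dict of lists, for example
    {"nameservers": [...], "ips": [...]}. The field names are hypothetical.
    """
    groups = defaultdict(set)
    for domain, fields in records.items():
        for value in fields.get(field, []):
            groups[value.lower()].add(domain)
    # Keep only values seen on more than one domain: those are the leads.
    return {
        value: sorted(domains)
        for value, domains in groups.items()
        if len(domains) > 1
    }


# Placeholder records using reserved documentation domains and addresses.
records = {
    "example.com": {"nameservers": ["ns1.example-host.test"], "ips": ["192.0.2.10"]},
    "example.net": {"nameservers": ["ns1.example-host.test"], "ips": ["192.0.2.10"]},
    "example.org": {"nameservers": ["ns9.other-host.test"], "ips": ["198.51.100.7"]},
}

print(group_by_shared_value(records, "ips"))
# {'192.0.2.10': ['example.com', 'example.net']}
```

A shared IP address or nameserver is a lead rather than proof, since unrelated sites often sit on the same hosting provider. Overlaps like these are most useful when they line up with other evidence, such as documents or registration details.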
In this expanded sequel, there were few flashy tricks. We did use various 4chan archives and pushshift.io for historical Reddit searches, but really this was mostly about a long-term, thorough exploration of the emergence of the boogaloo subculture online. It may be the most-cited piece I have been involved in writing.
Sometimes it is tempting to think of open source investigation as being composed of tricky stuff: extraordinary feats of geolocation, or the use of an obscure tool to find the key email address, IP address, or phone number. But it is also important to properly process and manage the resources we have. I will offer details on this at some other time, but this story was possible because I was able to turn some leaked screenshots into searchable PDFs with OCR, cross reference them with voice-to-text transcriptions, and further compare them with other leaked social media materials. All of this involved getting familiar with some command line tools and applying them consistently to a well maintained repository of materials.
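The cross-referencing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does no OCR or transcription itself, and it assumes the text of each screenshot, transcript, and export has already been extracted into plain strings by whatever tools you prefer. The file names and sentences below are invented placeholders, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return the set of distinct word-like tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def cross_reference(index, term):
    """Return the sorted names of every document that mentions the term."""
    return sorted(index.get(term.lower(), set()))


# Invented placeholder text standing in for OCR output, a voice-to-text
# transcript, and a social media export.
documents = {
    "screenshot_001.txt": "Meet at the warehouse on Friday",
    "call_transcript.txt": "he said the warehouse was ready",
    "post_export.txt": "see everyone on friday",
}

index = build_index(documents)
print(cross_reference(index, "warehouse"))
# ['call_transcript.txt', 'screenshot_001.txt']
print(cross_reference(index, "Friday"))
# ['post_export.txt', 'screenshot_001.txt']
```

The point of the design is consistency more than cleverness: once every source is reduced to text in one place, a name or phrase that surfaces in one kind of material can be checked against all the others in a single lookup.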
Fellow freelancers will recall that there was a time in Q2 in which only Coronavirus-related stories were saleable, and I wrote my share. This one is a favorite because it came from the most basic, and in many ways most effective form of open source investigation, one that I call "aggressive googling", and most refer to as "Google dorks". Every week, I spend some time putting in some long search strings to see what comes up. As I recall, this one was something like:
filetype:"xls | xlsx | doc | docx | ppt | pptx | pdf" inurl:"gov | org" "coronavirus | covid-19" "sitrep | situation report"
Or some iteration thereon.
I then time-limited the results to the previous week or month using Google's Tools button. Multiple copies of a report showed up which offered many details, including the effects of a supply squeeze on CO2. The rest of the reporting process simply sought expert, industry and government comment on the issue at hand.
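If you iterate on long search strings every week, it can help to assemble them from parts rather than retyping them. The following Python sketch rebuilds the string quoted above from lists of terms and turns it into a search URL. The function names are hypothetical, the grouping syntax simply mirrors the string as recalled in this post, and there is no guarantee that Google interprets every operator in the way the string intends.

```python
from urllib.parse import quote_plus


def group(terms):
    """Join alternatives with ' | ' inside one pair of double quotes."""
    return '"' + " | ".join(terms) + '"'


def build_dork(filetypes, url_terms, topic_terms, doc_terms):
    """Assemble a search string in the same shape as the one quoted above."""
    return " ".join([
        "filetype:" + group(filetypes),
        "inurl:" + group(url_terms),
        group(topic_terms),
        group(doc_terms),
    ])


def search_url(query):
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = build_dork(
    filetypes=["xls", "xlsx", "doc", "docx", "ppt", "pptx", "pdf"],
    url_terms=["gov", "org"],
    topic_terms=["coronavirus", "covid-19"],
    doc_terms=["sitrep", "situation report"],
)
print(query)
print(search_url(query))
```

Swapping one list at a time (a different topic, a different document type) gives a quick way to produce the "iterations" on a string that has worked before. The time limit is still applied by hand in the results page.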
This story started with a tip, though the tipster thought that they were passing on something else altogether. No spectacular tricks - indeed it is a good example of a method starting to get bedded down as standard operating procedure. I can still pull up screenshots and mhtml files from Facebook pages and websites associated with the Pozzarros, who ran a formidable fake news network until Facebook acted on their pages following this story. I always document everything in this way! I was able to confirm some elements by looking at the DNS records of their website, and I find and preserve historical DNS records for every domain that a story might intersect with. I used CrowdTangle to assess the impact and reach of the Porrazzos' sites, and I think traffic and reach is one measure of newsworthiness.
I think it reflects the mindset that I had fully taken on by the end of the Trump era: social media is not some adjunct to politics, political organizing, and radicalization: it is the main arena for these activities, especially on the right.