With a team from InvestigateTV, we studied government reports, analyzed millions of rows of data and obtained private communications sent to public officials.

In 2019 and 2020, Investigate Midwest published a story on the biggest beneficiary of the Market Facilitation Program, a U.S. Department of Agriculture program meant to pay farmers for lost profits during the Trump-era trade war.

But we hadn’t taken a look at the full scope of the program and the issues that plagued the MFP from its inception. 

When Jamie Grey, Emily Featherston and Lee Zurik of Gray Television’s InvestigateTV team approached us about a collaboration in early March of this year, they had a massive dataset, a couple of GAO reports and an idea: to do a deep dive into what went wrong in each step of the MFP. They also had spent the past couple years reporting on farm subsidies in general, as part of their “Secret Subsidies” series.

InvestigateTV had obtained the dataset via a Freedom of Information Act request. It was huge: more than 3 million rows, each one detailing a payment made through the program. 

But there was information missing. The database didn’t contain information on the crop or type of livestock a farmer received payments for. While we could see the mailing address for the person who received the payment, we didn’t know the address of their farm — just the city and county.  

As soon as we decided to take on this project, I filed my own FOIA request with the USDA, requesting emails mentioning the Market Facilitation Program. This wouldn’t pay off until we were almost ready to publish.

I also decided very early on that I wanted to create a map of payments so we could pinpoint potential areas of interest in the reporting process. Further down the line, this would allow readers to explore the data visually. 

I started summarizing and cleaning the data using SQLite, grouping the data by unique name and address pairs so that all of the payments made to the same person over the course of the program would be collapsed into one line of data. This took our dataset from 3 million rows to about 750,000, a more manageable amount of data to process. 
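For readers curious what that grouping step looks like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and sample rows are all made up for illustration; the real USDA file uses its own field names, and the exact queries used in this project are not reproduced here.

```python
import sqlite3

def collapse_payments(rows):
    """Collapse individual payments into one row per (name, address) pair.

    `rows` is a list of (name, address, amount) tuples. The schema is a
    hypothetical stand-in, not the USDA's actual column layout.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (name TEXT, address TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    # GROUP BY folds every payment to the same name/address pair into one line.
    result = conn.execute(
        """
        SELECT name, address, COUNT(*) AS n_payments, SUM(amount) AS total_paid
        FROM payments
        GROUP BY name, address
        ORDER BY total_paid DESC
        """
    ).fetchall()
    conn.close()
    return result

# Invented sample rows, only to show the shape of the output.
sample = [
    ("Pat Farmer", "1 Main St, Ames, IA", 1200.0),
    ("Pat Farmer", "1 Main St, Ames, IA", 800.0),
    ("Lee Grower", "9 Oak Ave, Peoria, IL", 500.0),
]
collapsed = collapse_payments(sample)
for row in collapsed:
    print(row)
```

One caveat with this approach: the same person can appear under slightly different spellings or address formats, and a plain GROUP BY treats those as separate recipients unless the text is standardized first.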

Then I turned the spreadsheet into data that was mappable — a process called “geocoding.” I used geocod.io to turn the street addresses into latitude and longitude points. 

I brought the data into QGIS, a computer program that allowed me to turn the spreadsheet into a shapefile, a way of storing the data as points on a plane instead of in a spreadsheet. In QGIS, I whittled down the file size by trimming off redundant or excess data, such as stand-in codes for counties and payment dates. 
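The same idea, turning spreadsheet rows into points and dropping columns the map does not need, can be sketched in a few lines of Python. This is not the QGIS workflow described above: it writes GeoJSON rather than a shapefile, and the field names (`lat`, `lng`, `county_code` and so on) are assumptions chosen for the example.

```python
import json

def rows_to_geojson(rows, keep=("name", "address", "total_paid")):
    """Convert geocoded rows (dicts) into a GeoJSON FeatureCollection.

    Only the properties listed in `keep` are carried over, which trims
    redundant columns and shrinks the file. All field names here are
    hypothetical.
    """
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(row["lng"]), float(row["lat"])],
            },
            "properties": {key: row[key] for key in keep if key in row},
        })
    return {"type": "FeatureCollection", "features": features}

# Invented sample row with an extra column that should be dropped.
sample_rows = [
    {"name": "Pat Farmer", "address": "1 Main St, Ames, IA",
     "total_paid": 2000.0, "lat": "42.03", "lng": "-93.62",
     "county_code": "169"},
]
collection = rows_to_geojson(sample_rows)
print(json.dumps(collection, indent=2))
```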

Working on a dataset of this size meant that one command — for example, deleting one column — took up to an hour. I’d click one button, then leave my laptop to process while I made lunch or played with my cat. It took days to get the file size small enough to upload.

Another problem popped up when I tried to upload the shapefile to Mapbox, the online service I used to host the map and make it interactive. Even though the file size was just under Mapbox’s limit, some of the “tiles” — the chunks of data making up the map — were too large.

I ran Tippecanoe, a command-line tool, on my MacBook to reduce the tile sizes without losing the data’s integrity. I imported the points into Mapbox and used the website’s styling tools to change the dot sizes to reflect the size of the payments. 
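Sizing dots by payment amount comes down to a linear scale between a smallest and a largest radius. The sketch below shows that arithmetic in Python and builds the equivalent Mapbox style expression for a layer's `circle-radius` property. The styling for this map was done in Mapbox's website tools, not in code, so the property name `total_paid` and the pixel and dollar ranges are assumptions for illustration only.

```python
import json

def circle_radius_expression(prop, min_value, max_value, min_px=2, max_px=20):
    """Build a Mapbox 'interpolate' expression for the circle-radius property.

    `prop` is the name of the feature property holding the payment total
    (hypothetical here).
    """
    return ["interpolate", ["linear"], ["get", prop],
            min_value, min_px,
            max_value, max_px]

def radius_for(value, min_value, max_value, min_px=2, max_px=20):
    """Compute the radius a linear scale would give a single payment.

    Values outside the range are clamped to the nearest end.
    """
    clamped = max(min_value, min(value, max_value))
    fraction = (clamped - min_value) / (max_value - min_value)
    return min_px + fraction * (max_px - min_px)

# Assumed range: $0 to $1 million maps to 2px through 20px.
expression = circle_radius_expression("total_paid", 0, 1_000_000)
print(json.dumps(expression))
print(radius_for(500_000, 0, 1_000_000))
```

A linear scale lets a handful of very large payments dominate the map. A square-root or logarithmic scale is a common alternative when totals vary by several orders of magnitude.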

Next, I wanted to figure out how to let users see the name, address and payment amount for each point. Mapbox doesn’t allow popups in maps embedded on websites (iframes), but I could get around the issue by placing the map on its own webpage. So I used HTML to code a basic Google Site and popups that appear when you click on the dots.

But in May, we hit an obstacle that upended all the work we had done. 

Emily Featherston from InvestigateTV found the USDA had posted MFP payment data online, and it looked different from the data the agency had provided us earlier. 

Most of the discrepancies were minor, such as differences in address formatting and payment dates. In calls with USDA officials, Emily learned that the USDA had changed the way it maintained the data between the time InvestigateTV received the FOIA response and when the data was posted online. The online data was cleaner and more accurate. 

This meant I had to go back to square one with my data analysis: bringing the raw data back into SQLite, spending days manipulating it in QGIS, then processing it further with Tippecanoe and Mapbox. Once that was done, we could return our attention to the reporting process. 

Using Government Accountability Office reports as a guide, we explored several aspects of the program that the GAO found issues with: compliance checks, payment distribution, transparency and methodology. We interviewed the GAO, farmers who received payments, a lawyer representing farm partnerships, and agricultural policy experts. 

With InvestigateTV creating its TV package, I got to writing. The day before I needed to turn in the draft of the written article, an email landed in my inbox. It was from the USDA, providing dozens of pages of emails in response to the public records request I had made five months earlier.

I scanned the emails and realized they contained a new layer to the story. In the summer of 2019, as the USDA was reworking the MFP in advance of a second round of payments, industry organizations representing various agricultural products were emailing and calling USDA officials in an attempt to direct MFP money to the farmers they represented. 

I messaged the team to let them know I had a major update just two weeks before publication. After closely reading the emails again, I confirmed that every organization that emailed the USDA got what it wanted. We embedded the emails in the story so readers can see for themselves how industries directly lobby the USDA.

The result is a story with many layers, taking a deep dive into the administration of the MFP and the political forces that shaped it.

Top image: photo by InvestigateTV