Day 17

Feed the Future PBS RAG

During my time at USAID, I had the opportunity to work on the Feed the Future Initiative, a United States government initiative designed to end global poverty, hunger, and malnutrition. The strategy centered on strengthening agricultural systems and supporting smallholder farmers across a range of countries. It was an ambitious effort that recognized a simple truth: sustainable development begins with local food systems and the people who depend on them.

I worked on a team responsible for the quantitative backbone of the initiative. Our role spanned annual reporting and the stewardship of large-scale, population-level datasets collected from partner countries. The data collection process was rarely straightforward. Surveys were conducted in remote and hard-to-reach areas, often in places where reliable infrastructure was limited. In many cases, the information gathered represented the only systematic data available about those populations. That reality carried both weight and responsibility.

Collecting the data was only the first challenge. Making it meaningful was another. The resulting reports were detailed and nuanced, reflecting the complexity of agricultural systems, economic pressures, and nutritional outcomes. They were also dense, lengthy documents. The careful language and layered interpretation were necessary for accuracy, but they made it difficult for policymakers and stakeholders to quickly grasp key insights.

Our team spent significant time bridging that gap. We developed dashboards to surface patterns and trends. We produced high-level summaries that distilled complex findings into accessible narratives. The goal was not to oversimplify but to clarify, so that evidence could inform decisions in a timely and practical way.

Recently, after spending considerable time building AI systems and experimenting with retrieval-augmented generation (RAG), I began to see those earlier reports in a new light. What if the historical population-based surveys and their accompanying analyses could be ingested into a RAG system?

What if policymakers, researchers, or practitioners could query decades of work conversationally, drawing from carefully curated evidence without having to manually search through hundreds of pages?

Building such a system is not something that can be completed in a single day. It requires thoughtful document processing, metadata design, chunking strategies, evaluation protocols, and attention to the nuances embedded in the original reports. Still, I have begun sketching a blueprint. I am mapping the architecture, outlining what data preparation would entail, and starting to process the documents themselves.
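To make the chunking and metadata steps a little more concrete, here is a minimal sketch of what one early piece of the pipeline might look like. It is an illustration only, not the project's actual code: the `Chunk` fields, the `chunk_report` function, and the window sizes are all hypothetical choices, and it assumes the report text has already been extracted from the PDF.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    country: str       # hypothetical metadata field
    survey_year: int   # hypothetical metadata field
    chunk_index: int   # position of the chunk within its report


def chunk_report(text, country, survey_year, max_words=200, overlap=40):
    """Split report text into overlapping word windows, each tagged with metadata."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for i, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(" ".join(window), country, survey_year, i))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, and carrying the country and survey year on every chunk would let a retriever filter by them later. A real version would probably split on sections or paragraphs rather than raw word counts, since the reports depend on context that fixed windows can cut through.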

The prospect is energizing. Initiatives like Feed the Future generate enormous amounts of knowledge, much of it locked inside static PDFs and archived reports. With the right infrastructure, that knowledge could become far more accessible and interactive. It could support new research questions, inform future development programs, and preserve institutional memory in a way that static documents never quite achieve.

There is more work ahead before this idea becomes reality, and it will likely be a project I return to over time. Yet the possibility remains compelling: to ensure that the insights generated through years of fieldwork and analysis continue to inform efforts to reduce hunger and poverty long into the future.

Here is the website. It is not fully functioning yet, but you get the idea. Code on GitHub
