Nigeria’s Soybean Trade: A Case Study in Data Analysis and Global Patterns


Introduction

In my ninth week at the DataraFlow Data Science Internship, I explored Nigeria’s soybean trade (2021–2023) using Python and UN Comtrade data. My interest in soybeans started during my undergraduate thesis in petroleum engineering, where I studied soybean oil as a substitute for diesel in oil-based muds. Curious about the commodity’s availability in Nigeria, I decided to run a detailed trade flow analysis, focusing on two related commodities:

  • 120110: Soybean seeds (whether or not broken)

  • 120190: Other soybeans (non-seed, whether or not broken)

Soybeans are more than just a crop. They are a global commodity used in food, animal feed, and even biofuels. Understanding Nigeria’s role in this market provides insights into both agricultural economics and energy alternatives.

The goal was to understand trade balances, identify key partners, track regular customers, and reveal bi-directional trade relationships.

By cleaning and analyzing three years of monthly trade flows, I discovered that Nigeria is not just participating in the global soybean market; it is thriving as a net exporter. The numbers highlight who Nigeria trades with and why certain countries matter more than others.

Data Acquisition & Loading

The datasets were sourced from the UN Comtrade database, covering Nigeria’s soybean trade flows between 2021 and 2023. Each year’s data was downloaded as a CSV file and loaded into pandas for analysis.
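The loading step described above can be sketched as follows. This is a minimal illustration rather than the exact code used in the project: the helper name `load_years` and the in-memory CSV snippets are assumptions, standing in for the three yearly files downloaded from UN Comtrade.

```python
import io

import pandas as pd


def load_years(sources):
    """Read one CSV per year and stack the results into a single DataFrame."""
    frames = [pd.read_csv(src) for src in sources]
    # ignore_index gives the combined frame a clean 0..n-1 index
    return pd.concat(frames, ignore_index=True)


# Tiny in-memory stand-ins for the yearly downloads (values are made up).
csv_2021 = (
    "refYear,flowDesc,partnerDesc,primaryValue\n"
    "2021,Export,India,1000.0\n"
    "2021,Import,China,50.0\n"
)
csv_2022 = (
    "refYear,flowDesc,partnerDesc,primaryValue\n"
    "2022,Export,India,1200.0\n"
)

raw = load_years([io.StringIO(csv_2021), io.StringIO(csv_2022)])
print(raw.shape)
```

In practice, the list passed to `load_years` would hold the paths of the downloaded CSV files instead of `io.StringIO` objects.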

Data Preparation Steps

To make the raw files usable, several preprocessing steps were applied:

  • Column Selection:
    Only the most relevant fields were retained:

    • refYear → year of trade

    • period → reporting period (month)

    • flowDesc → trade flow (Import/Export)

    • reporterDesc → reporting country (Nigeria)

    • partnerDesc → trade partner country

    • cmdDesc → commodity description

    • cmdCode → HS commodity code (120110, 120190)

    • primaryValue → trade value in USD

  • Renaming for Clarity:

    • refYear → Year

    • flowDesc → TradeFlow

    • primaryValue → Trade Value (USD), and so on

    This made the dataset easier to interpret and aligned with best practices for readability.

  • Filtering:
    Entries where partnerDesc = 'World' were removed. These represent aggregate totals and would distort partner-level analysis.

  • Sorting:
    Data was sorted in descending order by Trade Value (USD) to quickly identify the most significant trade partners in each year.
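The four preparation steps above can be sketched as one small function. This is a hedged illustration under a few assumptions: the function name `prepare` is hypothetical, the toy rows are invented, and the rename map only covers the names mentioned in this article plus `Partner`, which appears in the later code.

```python
import pandas as pd

# Fields retained from the raw Comtrade export
KEEP = [
    "refYear", "period", "flowDesc", "reporterDesc",
    "partnerDesc", "cmdDesc", "cmdCode", "primaryValue",
]

# Only part of the rename map is spelled out in the text; "Partner" is assumed
RENAME = {
    "refYear": "Year",
    "flowDesc": "TradeFlow",
    "partnerDesc": "Partner",
    "primaryValue": "Trade Value (USD)",
}


def prepare(raw):
    """Select columns, drop the 'World' aggregate, rename, and sort by value."""
    df = raw[KEEP]
    df = df[df["partnerDesc"] != "World"]
    df = df.rename(columns=RENAME)
    df = df.sort_values("Trade Value (USD)", ascending=False)
    return df.reset_index(drop=True)


# Toy input with one aggregate row and one extra column (values are made up)
raw = pd.DataFrame({
    "refYear": [2021, 2021, 2021],
    "period": [202104, 202104, 202104],
    "flowDesc": ["Export", "Import", "Export"],
    "reporterDesc": ["Nigeria", "Nigeria", "Nigeria"],
    "partnerDesc": ["World", "China", "India"],
    "cmdDesc": ["Soya beans", "Soya beans", "Soya beans"],
    "cmdCode": [120190, 120190, 120190],
    "primaryValue": [1500.0, 50.0, 1000.0],
    "qty": [3, 1, 2],
})

tidy = prepare(raw)
print(tidy[["Partner", "TradeFlow", "Trade Value (USD)"]])
```

Dropping the 'World' row before any grouping matters because it would otherwise be counted as if it were a partner country.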

Data Loading Challenge

While loading the CSVs, I encountered a subtle but critical issue: a missing column header. This omission caused two columns to merge, producing incorrect readings.

At first glance, the dataset looked fine, but the values did not align with the headers. After investigating, I traced the problem to the missing header.

Solution:

I manually added the missing column name to the source file and reloaded the dataset into pandas. I then validated the fix by checking row counts and column consistency.

This experience reinforced a key lesson: data validation is not optional. Even small formatting issues can cascade into major analytical errors. Curiosity and patience were essential in catching the anomaly before moving forward.
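A check of this kind can be automated before the file ever reaches pandas. The sketch below is not the validation used in the project; it assumes a hypothetical helper, `find_ragged_rows`, that compares the number of fields in each data row with the number of names in the header, using Python's standard `csv` module.

```python
import csv
import io


def find_ragged_rows(handle):
    """Return the header width and every (line_number, width) that differs from it."""
    reader = csv.reader(handle)
    header = next(reader)
    bad = [
        (line_no, len(row))
        for line_no, row in enumerate(reader, start=2)  # line 1 is the header
        if len(row) != len(header)
    ]
    return len(header), bad


# Toy file whose header is missing one column name (values are made up)
sample = io.StringIO(
    "refYear,flowDesc,primaryValue\n"
    "2021,Export,India,1000.0\n"
    "2021,Import,China,50.0\n"
)

width, bad = find_ragged_rows(sample)
print(width, bad)
```

An empty `bad` list means every row has as many fields as the header, which is the column consistency the fix was meant to restore.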

Analysis & Findings

1. Export vs Import Balance

Nigeria’s soybean exports consistently exceeded imports from 2021 to 2023. This confirms a strongly positive trade balance and marks Nigeria as a net exporter of soybeans. The totals for the period are shown below:

Trade Flow | Trade Value (USD)
Export | 2.735933e+08
Import | 4.597881e+06
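Totals like these come from grouping by trade flow and summing the values. The following is a minimal sketch on invented numbers, assuming the column names `TradeFlow` and `Trade Value (USD)` from the renaming step:

```python
import pandas as pd

# Toy records (values are made up, not the Comtrade figures)
trade = pd.DataFrame({
    "TradeFlow": ["Export", "Export", "Import", "Export", "Import"],
    "Trade Value (USD)": [500.0, 300.0, 40.0, 200.0, 10.0],
})

# Split by flow, sum the values, then compare the two totals
totals = trade.groupby("TradeFlow")["Trade Value (USD)"].sum()
balance = totals["Export"] - totals["Import"]
ratio = totals["Export"] / totals["Import"]

print(totals)
print(balance, ratio)
```

Applied to the reported totals, the same subtraction gives a surplus of about 2.69e+08 USD, with exports roughly 59.5 times the value of imports.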

A positive trade balance means:

  • Foreign exchange earnings: Soybeans contribute hard currency inflows to Nigeria’s economy.

  • Agricultural strength: Nigeria is not only consuming soybeans; it is producing enough to supply international markets.

  • Strategic diversification: In a country often associated with oil exports, soybeans represent a growing agricultural export story.

2. Main Trading Partners

Once the trade balance was established, the next step was to identify who Nigeria trades soybeans with most frequently and at the largest scale. By grouping the data by partner country and summing trade values, a clear picture emerged of dominant suppliers (imports) and anchor customers (exports).

  • Imports (Sources into Nigeria)

Nigeria’s soybean imports are small relative to exports, but the partner breakdown is revealing. China was the largest supplier across all three years and appeared consistently in the monthly flows, making it Nigeria’s most important import partner. The USA followed as a regular supplier, especially of soybean seeds (HS 120110). These imports are likely used for cultivation and as agricultural inputs rather than for direct consumption, and their volumes are smaller than China’s but steady. In third place came the Marshall Islands, a surprising entry that appeared as a notable supplier in 2021. This is likely linked to re-export or shipping arrangements rather than direct agricultural trade.

Other occasional partners include Argentina, the Netherlands, Malaysia, Morocco, and Zimbabwe. These appear sporadically, often in small volumes.

  • Exports (Destinations from Nigeria)

Exports tell a very different story: large volumes, diverse destinations, and consistent demand. India is the anchor customer and was Nigeria’s largest buyer throughout 2021–2023. It appears nearly every month, purchasing both seed and non-seed soybeans, often in bulk shipments worth millions of USD, which makes it the most reliable market. Pakistan is a strong secondary buyer, especially in 2022–2023, with a regular monthly presence, though it is not as consistent as India. France emerges as a top European destination.

Other notable destinations are Canada, Turkey, and Côte d’Ivoire, along with smaller but recurring buyers such as Nepal, Bangladesh, Algeria, the UAE, Germany, and Sri Lanka. The diversity of export destinations shows Nigeria’s integration into both regional African trade and global supply chains.
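The partner rankings in this section rest on a group-and-sum by partner within each trade flow. Here is a minimal sketch on invented numbers; the helper name `top_partners` is hypothetical, and the column names follow the renaming step described earlier.

```python
import pandas as pd

# Toy records (values are made up, not the Comtrade figures)
trade = pd.DataFrame({
    "Partner": ["India", "Pakistan", "India", "China", "USA", "China"],
    "TradeFlow": ["Export", "Export", "Export", "Import", "Import", "Import"],
    "Trade Value (USD)": [900.0, 300.0, 600.0, 80.0, 30.0, 20.0],
})


def top_partners(df, flow, n=5):
    """Total trade value per partner for one flow, largest first."""
    subset = df[df["TradeFlow"] == flow]
    totals = subset.groupby("Partner")["Trade Value (USD)"].sum()
    return totals.sort_values(ascending=False).head(n)


top_exports = top_partners(trade, "Export")
top_imports = top_partners(trade, "Import")
print(top_exports)
print(top_imports)
```

Ranking by summed value answers "who matters most by size"; it says nothing about how regularly a partner trades, which is the question the next section takes up.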

3. Regular Customers (Monthly Buyers)

Identifying the most consistent buyers was a critical step. Rather than just looking at totals, I wanted to know: which partners show up every month across the dataset?

To achieve this, I used a split–apply–combine approach in pandas. First, I grouped exports by partner, then applied a function to check whether a partner appeared in all 12 months. Finally, I filtered for a specific case, April (Month 4) and non-seed soybeans (HS 120190), to validate the method.

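# `exports` is assumed to hold only the export rows of the cleaned dataset
def buysEveryMonth(group):
    # True when the partner has at least one record in each of the 12 calendar months
    return group['Month'].nunique() == 12

# Split by partner, then keep all rows of the partners that pass the check
grouped = exports.groupby('Partner')
regular = grouped.filter(buysEveryMonth)

# Spot check: April shipments of non-seed soybeans (HS 120190)
filtered = regular[
    (regular['Month'] == 4) &
    (regular['commodity code'] == 120190)]

filtered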

This function ensured that only true regular customers were retained, i.e., those appearing in every month of the year. The filtered subset confirmed that India was the standout partner, consistently present across months and commodity codes. The approach highlights the difference between big buyers and reliable buyers: India is not just the largest partner by value but also the most consistent, showing up month after month. That reliability makes India the backbone of Nigeria’s soybean export strategy.
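One caveat is worth making explicit. When several years are pooled, a test of `nunique() == 12` on the month column is satisfied by any partner that covers all twelve calendar months at some point in the period, even if no single year is complete. The sketch below contrasts that pooled check with a stricter per-year variant. It is an illustration on a toy dataset, not the project's code: partners A, B, and C, the `Year` column, and both function names are assumptions.

```python
import pandas as pd

# Toy export records: A buys every month of both years, B covers the
# twelve calendar months only across two years, C buys once.
rows = []
for year in (2021, 2022):
    for month in range(1, 13):
        rows.append({"Partner": "A", "Year": year, "Month": month})
for month in range(1, 7):
    rows.append({"Partner": "B", "Year": 2021, "Month": month})
for month in range(7, 13):
    rows.append({"Partner": "B", "Year": 2022, "Month": month})
rows.append({"Partner": "C", "Year": 2021, "Month": 4})
exports = pd.DataFrame(rows)

n_years = exports["Year"].nunique()


def buys_every_calendar_month(group):
    """Pooled check: all 12 calendar months appear somewhere in the period."""
    return group["Month"].nunique() == 12


def buys_every_month_of_every_year(group):
    """Stricter check: 12 distinct months in each year of the dataset."""
    per_year = group.groupby("Year")["Month"].nunique()
    return bool(len(per_year) == n_years and (per_year == 12).all())


by_partner = exports.groupby("Partner")
pooled = set(by_partner.filter(buys_every_calendar_month)["Partner"])
strict = set(by_partner.filter(buys_every_month_of_every_year)["Partner"])
print(pooled, strict)
```

Which definition of "regular" is appropriate depends on the question being asked; the stricter one is the safer reading of "buys every month".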

4. Bi-Directional Trade Partners

One of the most interesting insights from the soybean datasets was identifying countries that both supply soybeans to Nigeria and buy Nigerian soybeans back. Beyond knowing who Nigeria imports from and exports to, I wanted to uncover countries that appear on both sides of the trade ledger. These “two-way” relationships highlight deeper integration into global supply chains.

To achieve this, I used a pivot table in pandas to summarize trade values by partner and trade flow (Import vs Export). This allowed me to quickly spot countries with non‑null values in both columns.

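import pandas as pd

# Sum trade values per partner, with one column per trade flow (Export, Import)
countries = pd.pivot_table(
    Soybean,
    index='Partner',
    columns='TradeFlow',
    values='Trade Value (USD)',
    aggfunc='sum'
)
countries.head()
# Partners with a value in both columns trade in both directions
countries.dropna()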

This is the classic split–apply–combine pattern: it splits the dataset by partner, applies an aggregation (the sum of trade values), and combines the results into a single table, shown below:

Partner | Export (USD) | Import (USD)
China | 2.377218e+06 | 3653156.272
India | 1.380069e+08 | 5657.908
USA | 1.414774e+05 | 403621.444

The pivot table revealed three clear bi‑directional partners:

  • China: Dominant supplier of imports, but also an export destination.

  • USA: Supplies soybean seed while also buying Nigerian soybeans.

  • India: Supplies only small volumes to Nigeria, yet is consistently the largest export destination.
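The list of two-way partners can be cross-checked without a pivot table by intersecting the set of export destinations with the set of import sources. The following sketch uses invented numbers and assumes the column names from the renaming step; it also confirms that the pivot-and-dropna route gives the same answer on the toy data.

```python
import pandas as pd

# Toy records (values are made up, not the Comtrade figures)
trade = pd.DataFrame({
    "Partner": ["India", "India", "China", "China", "France", "Argentina"],
    "TradeFlow": ["Export", "Import", "Export", "Import", "Export", "Import"],
    "Trade Value (USD)": [900.0, 5.0, 20.0, 80.0, 60.0, 10.0],
})

# Partners Nigeria exports to, and partners it imports from
destinations = set(trade.loc[trade["TradeFlow"] == "Export", "Partner"])
sources = set(trade.loc[trade["TradeFlow"] == "Import", "Partner"])
two_way = sorted(destinations & sources)

# Same question answered with a pivot table: keep rows with no missing flow
pivot = trade.pivot_table(
    index="Partner",
    columns="TradeFlow",
    values="Trade Value (USD)",
    aggfunc="sum",
)
two_way_from_pivot = sorted(pivot.dropna().index)

print(two_way, two_way_from_pivot)
```

The pivot table has the advantage of showing the values side by side, while the set intersection is a quick sanity check on the partner list.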

Insights & Key Takeaways

The analysis confirms Nigeria’s position as a net exporter of soybeans, consistently generating more revenue from exports than it spends on imports. India emerges as the dominant export destination, while China leads as the primary supplier of imports. This dual dynamic underscores Nigeria’s integration into global agricultural flows, with India providing stable demand and China supplying critical inputs. Regular monthly buyers such as India, Pakistan, and Turkey highlight the presence of long-term, reliable trade relationships that reduce volatility and strengthen Nigeria’s export profile.

Equally important are the bi-directional trade partners (China, India, and the USA), which both supply soybeans to Nigeria and purchase Nigerian soybeans. These overlapping flows reveal complex supply-demand integration, where Nigeria imports seeds and specialized soybeans while exporting bulk outputs.

Conclusion

Nigeria’s soybean trade flows tell a story of consistency, dominance, and opportunity. With India as a reliable buyer and China as a key supplier, Nigeria sits at the crossroads of global soybean trade. The challenge ahead is scaling production sustainably to meet growing demand while leveraging these strong trade relationships.

For me, this project was more than just an exercise in data wrangling. It was a reminder that messy data hides powerful stories, and that with the right tools (in this case, Python, pandas, and a split–apply–combine mindset), those stories can be uncovered and shared.