By Yan Naung Oak, Phandeeyar
This is an abridged version of a module for an extractives data training online course that will be featured on School of Data’s website.
Extractives data comes from a wide array of sources. It’s the job of the person working with this data to extract, combine, analyse and communicate it. That’s where data visualisation comes in. If you have a basic grasp of working with data, you are probably familiar with basic chart types that are used in data visualisation.
There are many online resources that list taxonomies of data visualisation types and guide you in choosing the appropriate tool and type of data for each kind of visualisation. Datavizcatalogue.com and Datavizproject.com are great examples of taxonomy sites. Dataviz.tools is a very useful site that catalogs the different tools available for data visualisation. The Financial Times’ Visual Vocabulary chart provides a handy guide for matching data types to visualisation types.
When we are working with data from the extractives sector, we often find that there is a need to communicate “flows”. According to FT’s visual vocabulary chart, sometimes we need to “Show the reader volumes or intensity of movement between two or more states or conditions. These might be logical sequences or geographical locations”. Examples of flows in extractives data include:
In this module, we will create visualisations of the first of those examples, and then we will use data about the global uranium industry to demonstrate how to prepare data and create these visualisations.
We will be using a free-to-use online tool called RAW. RAW is designed to create highly customizable, static (i.e. not interactive) visualisations, and it makes it easy to create interesting visualisation types that many off-the-shelf tools do not provide.
RAW is an especially useful tool for designers because it lets you export visualisations as SVG files which can be further edited using vector graphics software such as Illustrator. You can check out some of the beautiful visualizations that are created using RAW in the gallery section of their website.
For developers, RAW offers a lot of customizability too. It is completely open source, and is built on top of D3.js, which is the most popular web data visualisation framework. If you’re ambitious and know a bit of D3, you can even add your own new chart types to RAW.
If you’re sold on how awesome RAW is, great! Let’s get started.
Sankey Diagram of Revenue Flows
The first kind of chart we will make is a sankey diagram that shows the revenue flows to government from the extractives sector as presented in an EITI report. Specifically, we will look at the revenue flows reported in Kazakhstan’s 2014 EITI report. “Wait, I thought we were going to focus on data from the uranium industry? What’s that got to do with Kazakhstan?” I hear you ask. Stay tuned, you will see why in the next section.
You can access the PDF file of the report from the global EITI website here, or from the google drive from the data section of the EITI website. We are specifically interested in this table on page 61 of the report:
We can use Tabula to extract the table from the report. After a bit of manual cleaning, we can get a table in Excel or Google Sheets that looks like this:
The first 3 rows of data are directly from the PDF table’s first five rows of data (excluding the share of total as % row). The last row “Non-Extractive Receipts”, is calculated with a simple formula, the “Tax Receipts Total” row minus the sum of the “Oil and Gas Receipts” and “Mining Receipts” rows.
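The same subtraction can be sketched in a few lines of Python instead of a spreadsheet formula. This is only an illustration: the column names and all of the numbers below are made up for the example and are not taken from the EITI report.

```python
# A minimal sketch of the "Non-Extractive Receipts" calculation.
# All figures are hypothetical placeholders, NOT values from the report.

def non_extractive_row(total_row, oil_gas_row, mining_row):
    """For each column: Tax Receipts Total minus (Oil and Gas + Mining)."""
    return {
        column: total_row[column] - (oil_gas_row[column] + mining_row[column])
        for column in total_row
    }

# Hypothetical column names and amounts, for illustration only.
tax_receipts_total = {"Republican Budget": 1000, "Local Budget": 600}
oil_and_gas_receipts = {"Republican Budget": 450, "Local Budget": 100}
mining_receipts = {"Republican Budget": 150, "Local Budget": 200}

non_extractive = non_extractive_row(
    tax_receipts_total, oil_and_gas_receipts, mining_receipts
)
print(non_extractive)
```

With these placeholder numbers, the republican budget column works out to 1000 - (450 + 150) = 400.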
Let’s look at what the column names mean:
We have to always remember the context behind the data we are trying to visualise. Especially in the extractives sector, the data is very complicated and tied up with the individual country and/or company’s policies. In Kazakhstan’s case, the total receipts are broken down into state budget and the national fund. The state budget is then broken into the republican budget and the local budget. In addition to that, there is also a special tax that companies in the oil sector have to pay that is not included in the state budget but included in the national fund. Writing it all out in text makes it sound quite confusing, and that’s why we’re visualising it in the first place.
RAW is incredibly easy to use, but the most difficult step is making sure the data is in the correct shape for the chart that you are trying to make. Notice the color-coded cells in Table 2 above? Those are the figures that we want to chart using RAW. But why are we ignoring the first two columns and the first row? It’s because the figures in those columns and that row are just sums of the other figures, and RAW will automatically sum up the figures for you. In general, you only want to give RAW the most disaggregated data.
Now that we know which are the figures we want to use, we still have to reshape it into a format that works for RAW’s sankey diagrams. Sankey diagrams have a series of stages, with the flows diverging or converging at each stage. Hence, we have to reshape the data so that it will look like Table 3 below. The color codings on the cells with the numbers are the same as in Table 2, so you know where each of the numbers go.
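If you would rather script the reshaping than do it by hand, the idea can be sketched in Python as below. Treat it as a rough sketch under assumptions: the amounts are invented, the step names ("Sector", "Fund", "Budget") are hypothetical column headers, and the mapping of destinations to an intermediate stage simply follows the description above (republican and local budgets sit inside the state budget, the national fund stands on its own). Your own Table 3 may use different steps.

```python
# A rough sketch of turning a "wide" table into one row per flow.
# All amounts are hypothetical placeholders, NOT values from the report.

# Wide shape: one entry per sector, one disaggregated figure per destination.
wide = {
    "Oil and Gas Receipts": {
        "Republican Budget": 50, "Local Budget": 20, "National Fund": 300,
    },
    "Mining Receipts": {
        "Republican Budget": 80, "Local Budget": 40, "National Fund": 10,
    },
}

# Assumed intermediate stage for each destination.
parent_stage = {
    "Republican Budget": "State Budget",
    "Local Budget": "State Budget",
    "National Fund": "National Fund",
}

def to_flow_rows(wide_table, stage_of):
    """Return one (sector, stage, destination, amount) tuple per cell."""
    rows = []
    for sector, cells in wide_table.items():
        for destination, amount in cells.items():
            rows.append((sector, stage_of[destination], destination, amount))
    return rows

flows = to_flow_rows(wide, parent_stage)

# Tab-separated text, the same shape you get when copying from a spreadsheet.
header = "Sector\tFund\tBudget\tAmount"
body = ["\t".join(str(value) for value in row) for row in flows]
print("\n".join([header] + body))
```

Each row now describes a single flow through every stage, and only the most disaggregated figures appear; the totals are left for the chart to add up.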
As you can see, we have divided the categories into different steps to show how each item is broken down into subcategories. Once you have this table of data prepared, we can go over to the RAW web app (apps.rawgraphs.io) to start visualising.
In the first screen that you see, you just copy and paste the data directly from your spreadsheet. Make sure to change the format of the numbers in your table so that they don’t contain any thousand separator commas (i.e. we want it not like this: 1,000,000, but like this: 1000000).
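Changing the number format in the spreadsheet is usually the easiest fix, but if you already have the data as text, the commas can also be stripped with a small script. The sketch below assumes the pasted text is tab-separated, and the helper names are made up for this example. It only touches cells that are purely numeric once the commas are removed, so labels that contain commas are left alone.

```python
# A small sketch for removing thousand-separator commas from tab-separated text.
# Helper names are hypothetical; the sample data is invented.

def clean_number(cell):
    """Strip commas from a cell only if the result is a plain number."""
    candidate = cell.replace(",", "")
    # Allow an optional leading minus sign and at most one decimal point.
    digits = candidate.lstrip("-").replace(".", "", 1)
    return candidate if digits.isdigit() else cell

def clean_table(tsv_text):
    """Apply clean_number to every cell of tab-separated text."""
    cleaned_lines = []
    for line in tsv_text.splitlines():
        cells = line.split("\t")
        cleaned_lines.append("\t".join(clean_number(cell) for cell in cells))
    return "\n".join(cleaned_lines)

sample = "Mining Receipts\t1,000,000\nOil, Gas Receipts\t2,500,000.5"
print(clean_table(sample))
```

Note that this assumes commas are used as thousand separators and full stops as decimal points; data formatted the other way round would need a different rule.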
If the data is accepted by RAW, the bar below the text box will turn green with a little thumbs up icon, and it will tell you how many rows of data have been loaded. In the top right corner, you can change the view of the data to a table view to see the data more clearly.
Scroll down. Once the data is loaded, RAW will let you choose the type of chart you want. The sankey diagrams that we want are called Alluvial Diagrams in RAW (there’s a subtle difference between the two, but the terms are often used interchangeably; you can refer to the dataviz project’s pages on sankey and alluvial diagrams to see the difference). Click on Alluvial Diagram in the list of charts.
Next, we have to choose which columns from the data we want to visualise. Since we have pre-prepared the data to fit the sankey diagram format on RAW, this step is quite simple. Drag the column names into the boxes as shown below.
After that, you’re basically done! Scroll down further to see what the chart looks like.
The chart updates live depending on what columns you drag into the boxes in the “Map Your Dimensions” section, so you can play around to see what kind of changes your choices make to the chart. For example, if you don’t include anything in the “Size” box, RAW will just assume each of the flows is the same size, as seen below:
There are some limited options for changing colors and dimensions on the left, but for real customization, RAW itself is not the best tool. It is best used in conjunction with a vector graphics editor like Illustrator to really polish up your charts. RAW is designed to make this kind of export to graphics editing software easy. Scroll down to the Download section to see how.
If you are satisfied with your chart and want to use it as an image, choose “image (png)” from the dropdown, give your file a name, click download, and you’re done! However, there are two other formats that you can get the chart in. Select “vector graphics (svg)” to get it in a format which can be edited further in vector graphics software. If you want to embed the chart in a web page, you can copy and paste the code in the “Embed SVG Code” box into your HTML. There is an additional option to download the chart’s data model in JSON format, but that option is for more advanced users and we won’t cover it in this tutorial.
That’s it! Making charts in RAW is super quick and simple. No need to register for accounts, everything is completely free (not “freemium”), and it’s all done on a simple web app on a single page.
Bump Chart of Uranium Production by Country
Next, let’s try another useful chart type that takes data in a different shape from the sankey chart.
In this section, we will visualise how uranium production has changed over time by country. The dataset we will use is from the World Nuclear Association’s page on World Uranium Mining Production. We want this table:
By Scott Sellwood, Oxfam America
This post originally appeared on politicsofpoverty.oxfamamerica.org on September 20, 2017
New research from Oxfam’s partners in Peru shows – yet again – how hard it can be for governments to protect the tax base over the life of a mining project (and hold mining companies accountable).
For many countries, tax and other payments from oil and mining companies represent an important source of government revenue. A case in point is Peru, where the government receives billions each year from companies in the extractive sector. But is Peru receiving all that it should be from these companies?
Last month, Peru’s Supreme Court ruled that its tax regulator (SUNAT) could finally recover millions in lost revenues from Peru’s largest copper mine, Cerro Verde. For the last six years, SUNAT has fought to recover $250 million in unpaid mining taxes between 2006 and 2009. Of this, $140 million is due to be paid to the local government of Arequipa – the region where the mine is located – under Peru’s decentralized mining, oil, and gas revenue sharing rules. These payments will help pay for urgently needed public investments. The Supreme Court appeal was the latest attempt by Cerro Verde to avoid paying what the government says is due.
Oxfam’s partner, Grupo Propuesta Ciudadana (GPC), has followed the Cerro Verde case closely and analyzed the publicly available data. At the center of the now six-year fight to recover the lost millions is a tax stabilization agreement signed in 1998 by then-President Alberto Fujimori, who is now imprisoned for corruption and gross human rights violations. The company argues that this agreement entitled it to tax exemptions related to its first major expansion investment in 2006, when it invested $900 million to nearly triple its annual production. Peru’s tax regulator disagrees, as do its courts.
Peru is right to be pursuing these unpaid taxes, but what if this is just the tip of the iceberg? GPC argues that the government should be trying to recover more from the Cerro Verde mine. GPC’s analysis shows that between 2006 and 2011 the mine failed to pay an additional $200 million in taxes. Cerro Verde, in its own financial statements, states that if it loses all the appeals it will owe $544 million in unpaid taxes for the period between 2006 and 2013.
Further, between 2005 and 2012 (the “boom” years for mining companies around the world) GPC estimates that Cerro Verde generated upwards of $5 billion in tax credits, as a result of overly generous fiscal terms. And last year, a second major investment by Cerro Verde allowed copper production to further double (500,000 tons in 2016). This is a major concern for the tax justice groups in Peru. Basically, despite production increasing and commodity prices recovering, a second tax stabilization agreement signed in 2015 (allowing for accelerated capital depreciation) is likely to mean that Cerro Verde’s taxable income for the next few years is effectively zero.
These discretionary tax exemptions are already having a huge impact on budget transfers to Arequipa: since 2012, subnational transfers from mining have collapsed (from an average of 70 percent in 2012 to just 2 percent in 2016). GPC cautions that these revenues are unlikely to recover until 2019 or 2020, at the earliest.
In just the last two years Oxfam has commissioned similar case study research in Cambodia, Kenya, Malawi, Zimbabwe, and Niger—which each map government risks to revenue. Understanding oil, gas, and mine economics at the individual project level allows us to understand how national tax policy, royalty policy, subsidies and other investment incentives affect the amount and timing of revenue being produced by extractives projects for government coffers – and then into investments that yield inclusive human development outcomes. For our partners and allies, it is at the project level where revenues are secured or lost and it is where the real transformative potential for those revenues to support pro-poor development outcomes rests – as opposed to “economic growth,” and its often false promises of sustainable and inclusive jobs, infrastructure, or voluntary corporate social responsibility commitments.
Like the Cerro Verde case, these case studies show how countries that are heavily dependent on minerals or hydrocarbons for government revenues lose taxes from a combination of poorly negotiated, overly generous, and secretive contracts, and weak fiscal regimes vulnerable to abuse. Unlike Peru, not all governments have the wherewithal to audit multinational mining companies, and stay the fight through years of appeals.
But it’s not all doom and gloom. Despite the seemingly infinite ways large mining, oil, and gas companies can avoid paying taxes in countries where they work – as new research from PWYP Canada shows – the pathways are not unlimited. There are clear patterns and concrete legal, policy, and administrative solutions that can minimize these risks.
In Peru, for example, the government should:
Peru’s fight to recover lost revenues is not unique. Too often, countries with significant mineral, oil and gas resources fail to secure a fair share of the revenues generated by these projects. Such losses (which some global estimates put in the billions) are, quite simply, a matter of life or death. The lost billions represent money that should have been spent on building schools and hospitals, paying teachers, doctors and nurses, and providing equal access to safe drinking water or health care, among other urgent development priorities.
For more than ten years, Oxfam has fought for law and policy reform to require public disclosure of project-by-project payments, contracts, and beneficial ownership. We continue to defend anti-corruption laws like Section 1504 of Dodd-Frank and are now seeing a flood of new disclosures from laws in the EU and Canada. These long fought for gains are now allowing us to better understand how individual mining, oil, and gas project revenues are lost and we are ramping up our campaigns to stop them.
Scott Sellwood is a Program Advisor for Extractive Industries at Oxfam America.