factbased: Index decomposition with R

A few days ago, I finally finished a small package, ida. It lets you analyse the contributions of underlying factors to the change in an aggregate, using methods based on index number theory. These methods became popular through the analysis of changes in CO2 emissions, but they are not restricted to that application.

Here is a chart that shows what the change of population, welfare, efficiency and fuel substitution contributed to CO2 emissions:

The numbers refer to Gt CO2. The data come from the World Bank; however, missing values were treated rather carelessly here, so the result may or may not be valid. Still, it puts the efforts in perspective: clearly, the reduction of energy use per unit of GDP has not been enough to compensate for the additional emissions from population growth and growth in income per capita. The carbon intensity, i.e. the emissions per unit of energy, remained nearly unchanged.

Here is how to produce the chart:

First, you need the package ida and, for this example, its suggested dependencies. The ida package is not yet on CRAN, but it is available from a self-hosted repository. Here is how to install it:

> install.packages("ida", repos = c(getOption("repos"), 
+     "http://userpage.fu-berlin.de/~kweinert/R"), 
+     dependencies = c("Depends", "Suggests"))

For Windows users, only an R 2.13 binary is available; users of other versions need to add the type = "source" argument.

Next, we load the example data. We represent the CO2 emissions as a product of four factors:

C(t) = p(t) * w(t) * i(t) * s(t)

with

p(t), the world’s population
w(t), the income per person, i.e. the GDP per capita
i(t), the energy intensity, i.e. the energy per GDP
s(t), the carbon intensity, i.e. the CO2 emissions per energy used
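
To see why the product of these four factors gives back the emissions, here is a minimal sketch in base R. The totals are made-up illustrative numbers, not World Bank data, and the variable names are chosen only for this example. Each ratio cancels against its neighbour, so only the CO2 total survives.

```r
# Hypothetical totals for a single year (illustrative values only,
# not taken from the World Bank data used in this post)
population <- 6.5    # people, in billions
gdp        <- 60     # economic output, arbitrary currency unit
energy     <- 11.5   # energy use, arbitrary energy unit
co2        <- 29     # emissions, Gt CO2

# The four factors of the decomposition
p <- population        # population
w <- gdp / population  # income per person
i <- energy / gdp      # energy intensity
s <- co2 / energy      # carbon intensity

# The intermediate totals cancel, leaving the emissions
emissions <- p * w * i * s
print(emissions)
print(isTRUE(all.equal(emissions, co2)))
```

Because the identity holds by construction, it says nothing about causes on its own; it only provides the factors whose changes are compared later.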

This decomposition of the CO2 emissions is known as the Kaya identity, after the researcher Yoichi Kaya. The country-level data for these factors ships with the ida package and can be aggregated with the transform_kaya function:

> world <- transform_kaya("world")
> world
          variable     Y1990     Y1991     Y1992     Y1993
1       population 5.2527625 5.3367567 5.4173742 5.4988154
2         gdp_pcap 6.7069964 6.6873499 6.7029494 6.7184605
3 energy_intensity 0.2405942 0.2397186 0.2351846 0.2331813
4 carbon_intensity 1.9561772 2.0998782 2.5575150 2.5694175
      Y1994     Y1995     Y1996     Y1997     Y1998
1 5.5792484 5.6611426 5.7409887 5.8208185 5.8997860
2 6.8261435 6.9418972 7.0965323 7.2881507 7.3668336
3 0.2277819 0.2262243 0.2238883 0.2166251 0.2124994
4 2.5612014 2.5424658 2.5441399 2.5385912 2.4862305
     Y1999     Y2000     Y2001    Y2002     Y2003     Y2004
1 5.978254 6.0559912 6.1329413 6.209233 6.2854260 6.3613650
2 7.541670 7.7996127 7.8757705 7.986565 8.1649870 8.4642982
3 0.208676 0.2033032 0.1995896 0.198509 0.1987728 0.1982568
4 2.442819 2.4734731 2.4804332 2.471007 2.4972474 2.5050409
      Y2005     Y2006     Y2007
1 6.4371665 6.5133449 6.5899233
2 8.7438915 9.0785423 9.4397475
3 0.1943548 0.1897489 0.1850416
4 2.5204078 2.5227782 2.5390367
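
As a quick plausibility check, the emissions for a given year can be recovered by multiplying the four rows of that year's column. The sketch below does this in base R for 1995 and 2007, with the values typed in by hand from the printout above; the data frame name `kaya` is only a label for this example and is not part of the ida package.

```r
# Factor values for 1995 and 2007, copied by hand from the printed table
kaya <- data.frame(
  variable = c("population", "gdp_pcap",
               "energy_intensity", "carbon_intensity"),
  Y1995 = c(5.6611426, 6.9418972, 0.2262243, 2.5424658),
  Y2007 = c(6.5899233, 9.4397475, 0.1850416, 2.5390367)
)

# Multiply the four factors within each year to recover the emissions
emissions <- sapply(kaya[c("Y1995", "Y2007")], prod)
print(round(emissions, 5))

# Absolute and relative change between the two years
total_change    <- emissions[["Y2007"]] - emissions[["Y1995"]]
relative_change <- emissions[["Y2007"]] / emissions[["Y1995"]] - 1
print(total_change)
print(relative_change)
```

The two products should agree, up to rounding of the printed inputs, with the index values that ida reports in the decomposition output further down.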

Now we can calculate the contributions of the four factors. Currently, two methods are implemented: the additive Laspeyres decomposition, which usually leaves a residual term of unexplained change, and the log-mean Divisia index I (LMDI-I) decomposition, which is not well defined for negative factors because it relies on logarithms. More details on these and other methods can be found in Alexia van der Cruisse de Waizier’s thesis. We choose the latter method here:

> lmdi <- ida(world, effect = "variable", from = "Y1995", 
+     to = "Y2007", method = "lmdi1")
> lmdi
Method:  lmdi1 (additive) 

Index Values:
   Y1995    Y2007 
22.60355 29.22665 

Decomposition results:
         variable         Y1995_Y2007
       population    3.91539472647059
         gdp_pcap    7.92156787903755
 energy_intensity   -5.17907244220765
 carbon_intensity -0.0347851393004786
     Total change    6.62310502400001
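
For readers who want to see what happens behind the call, here is a hedged sketch of both methods in base R. It uses the textbook formulas for the additive LMDI-I and the additive Laspeyres decomposition, which I assume correspond to what the package computes; the helper `logmean` and the `*_by_hand` names are made up for this example and are not part of ida. The inputs are the 1995 and 2007 columns typed in from the table above.

```r
# Factor values for 1995 (base year) and 2007, copied from the printed table
x0 <- c(population = 5.6611426, gdp_pcap = 6.9418972,
        energy_intensity = 0.2262243, carbon_intensity = 2.5424658)
xT <- c(population = 6.5899233, gdp_pcap = 9.4397475,
        energy_intensity = 0.1850416, carbon_intensity = 2.5390367)

# Emissions in both years are the product of the four factors
C0 <- prod(x0)
CT <- prod(xT)

# Logarithmic mean of two positive numbers (hypothetical helper)
logmean <- function(a, b) {
  if (isTRUE(all.equal(a, b))) a else (a - b) / (log(a) - log(b))
}

# Additive LMDI-I: weight each factor's log change by the log mean
# of the two emission levels
lmdi_by_hand <- logmean(CT, C0) * log(xT / x0)

# Additive Laspeyres: change one factor, hold the others at base-year values
laspeyres_by_hand <- sapply(names(x0), function(k) {
  (xT[[k]] - x0[[k]]) * prod(x0[names(x0) != k])
})

# Whatever the Laspeyres effects do not explain ends up in the residual
laspeyres_residual <- (CT - C0) - sum(laspeyres_by_hand)

print(round(lmdi_by_hand, 4))
print(round(laspeyres_by_hand, 4))
print(laspeyres_residual)
```

The LMDI-I contributions add up exactly to the total change and should reproduce the figures printed above up to rounding of the inputs, while the Laspeyres effects leave a non-zero residual for this data.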

The package has more features; for instance, aggregating over up to two levels is supported. This is a topic for a later post.

2011-07-09


Karsten W.'s