1 Background

In April 2019, the Australian Government created the Competition and Consumer Regulations (i.e., the Code), which came into effect on 1 July 2019.

BIT worked with the Australian Energy Regulator (AER) and Australian Competition and Consumer Commission (ACCC) to find the best way to present consumers with information about energy pricing, to best facilitate consumer comprehension.

2,504 respondents saw one of four versions of energy pricing advertisements, and were asked a series of questions about the three concepts we were testing.

The four versions we tested were Control, Description, Purpose, and Description + Purpose.

2 Estimation Strategy

2.1 Statistical Model

Before the experiment was run, a number of analyses were pre-specified, to ensure that these findings will be robust to future scaling and replication, and to reduce the risk of spurious findings.

For each outcome, we have presented two estimates of the impact of the energy advertisement framing that was varied:

A regression using just the characteristics of the energy advertisements as a predictor:

\[\text{Outcome}_i = \alpha + \beta \cdot \text{Treatment}_i + \epsilon_i\]

A regression using the random allocation and a set of covariates to improve precision:

\[\text{Outcome}_i = \alpha + \beta \cdot \text{Treatment}_i + \theta \cdot \text{Covariates}_i + \epsilon_i\]
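To make the first specification concrete: when the only predictors are treatment-arm indicators, the OLS solution has a closed form, with the intercept equal to the Control mean and each coefficient equal to that arm's mean minus the Control mean. The sketch below illustrates this with hypothetical toy data; the function name `dummy_regression` and the data are assumptions for illustration, not the code used for the analysis.

```python
from statistics import mean


def dummy_regression(outcomes, arms, control="Control"):
    """OLS of the outcome on treatment-arm dummies, with `control` omitted.

    With only arm dummies as predictors, the intercept is the control-group
    mean and each coefficient is that arm's mean minus the control mean.
    """
    by_arm = {}
    for y, arm in zip(outcomes, arms):
        by_arm.setdefault(arm, []).append(y)
    alpha = mean(by_arm[control])
    betas = {arm: mean(ys) - alpha for arm, ys in by_arm.items() if arm != control}
    return alpha, betas


# Hypothetical toy data: percentage of questions answered correctly.
outcomes = [50.0, 60.0, 70.0, 80.0, 55.0, 65.0]
arms = ["Control", "Control", "Description", "Description", "Purpose", "Purpose"]

alpha, betas = dummy_regression(outcomes, arms)
print(alpha, betas)
```

The second specification adds covariates, so its coefficients no longer reduce to simple differences in means and would normally be fitted with a regression library.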

We have included three covariates in the analysis, in order to account for differences between participants that could impact the outcomes. These are:

Numeracy score: The number of numeracy questions (out of 4) they answered correctly. This controls for pre-existing differences in numerical ability between the participants.
Education: The level of education completed, which can take on 5 values:
Have not completed High School
Completed High School
Vocational/Technical Qualification
Bachelor’s Degree/Undergraduate Diploma
Post-Graduate Diploma, Masters, or PhD
CALD: Whether the individual was identified as culturally and linguistically diverse (TRUE or FALSE)

Additionally, if the outcome is binary, we have presented the results from both a standard linear regression and the estimated average marginal effect from a logistic regression. All standard errors reported are heteroskedasticity-consistent ‘robust’ errors.
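As an illustration of what a heteroskedasticity-consistent error means in the simplest case, the sketch below computes the HC0 robust standard error for a single arm-versus-Control coefficient in the model without covariates. The choice of the HC0 variant is an assumption (the specific variant used is not stated here), and the function name and toy data are hypothetical.

```python
from math import sqrt


def robust_se_difference(treated, control):
    """HC0 robust standard error of mean(treated) - mean(control).

    In a regression on arm dummies only, the HC0 sandwich variance of an arm
    coefficient reduces to the sum, over the two groups involved, of
    (sum of squared residuals) / n**2, so each group keeps its own variance.
    """
    def mean_variance(values):
        n = len(values)
        m = sum(values) / n
        return sum((v - m) ** 2 for v in values) / n ** 2

    return sqrt(mean_variance(treated) + mean_variance(control))


# Hypothetical binary outcomes (1 = correct, 0 = incorrect).
control = [0, 0, 1, 1]
treated = [0, 1, 1, 1]

se = robust_se_difference(treated, control)
print(round(se, 4))
```

Unlike the classical OLS standard error, this does not pool the residual variance across arms, which matters when outcome variability differs between groups.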

For all analyses, we present a regression table with the estimates of interest, with standard errors in parentheses, and a bar chart showing the estimates as 4 different treatment groups. All regression tables will use ‘Control’ as the omitted category, with the estimates of interest giving the difference in the outcome from this category.

3 Balance Checks

Individuals were randomised into seeing one of the four versions of the energy advertisement. Below are summary statistics by group, showing differences between these treatment groups on the covariates used in the full model. We did not observe any meaningful differences in the covariates.

4 Primary Analysis

4.1 Does the way the 3 key concepts are presented affect consumer comprehension, relative to control?

Measure: We indexed consumer comprehension via the overall mean percentage of questions answered correctly.

	OLS	OLS
(Intercept)	53.31% ***	61.40% ***
	(0.84%)	(1.08%)
Description	8.00% ***	8.52% ***
	(1.20%)	(1.13%)
Purpose	1.96% +	2.67% *
	(1.16%)	(1.10%)
Description + Purpose	6.86% ***	7.17% ***
	(1.24%)	(1.19%)
Covariates included	No	Yes
N	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

Here we find evidence to suggest that all three different versions of the ads (Description, Purpose, and Description + Purpose) significantly outperform the Control arm on consumer comprehension, indexed by the percentage of correct responses for all comprehension questions.

Participants who saw the Description or Description + Purpose ads answered the highest percentage of questions correctly, followed by participants who saw the Purpose ads.
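The significance markers in the table can be approximately recovered from each estimate and its standard error. The sketch below assumes a two-sided test with a normal approximation (reasonable at N = 2,504, but an assumption nonetheless) and uses the rounded values reported in the table, so borderline cases could differ slightly from the original output. The function names are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p(estimate, std_error):
    """Two-sided p-value for estimate / std_error under a normal approximation."""
    z = abs(estimate / std_error)
    return erfc(z / sqrt(2))


def stars(p):
    """Map a p-value to the significance markers used in the regression tables."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "+"
    return ""


# Estimates and standard errors (percentage points) from the primary analysis table.
print("Description, no covariates:", stars(two_sided_p(8.00, 1.20)))
print("Purpose, no covariates:", stars(two_sided_p(1.96, 1.16)))
print("Purpose, with covariates:", stars(two_sided_p(2.67, 1.10)))
```

This makes visible that the Purpose estimate is only marginally significant without covariates and reaches the 5% level once covariates are included.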

5 Secondary Analysis

5.1 Does the way Concept A is presented affect consumer comprehension of this concept?

Measure: We indexed consumer comprehension via the overall mean percentage of questions answered correctly for Concept A.

	OLS	OLS
(Intercept)	52.42% ***	59.67% ***
	(1.25%)	(1.69%)
Description	15.22% ***	15.85% ***
	(1.73%)	(1.68%)
Purpose	3.01% +	3.76% *
	(1.66%)	(1.62%)
Description + Purpose	15.14% ***	15.43% ***
	(1.77%)	(1.75%)
Covariates included	No	Yes
N	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

Similarly to the primary analysis, we find that all three different versions of the ads (Description, Purpose, and Description + Purpose) significantly outperform the Control arm on consumer comprehension, indexed by the percentage of correct responses for Concept A. These comparisons all survive multiple comparison corrections using the Hochberg step-up procedure.
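For readers unfamiliar with the Hochberg step-up procedure, the following is a minimal sketch of how Hochberg-adjusted p-values can be computed. The input p-values are hypothetical and are not the p-values from this analysis; the function name `hochberg_adjust` is also an assumption.

```python
def hochberg_adjust(p_values):
    """Hochberg step-up adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, ordered from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank, idx in enumerate(order, start=1):
        # The largest p-value is multiplied by 1, the second largest by 2, etc.
        running_min = min(running_min, rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for three comparisons against Control.
raw = [0.001, 0.04, 0.03]
adjusted = hochberg_adjust(raw)
print(adjusted)
```

A comparison "survives" the correction at the 5% level when its adjusted p-value remains below 0.05.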

Participants who saw the Description or Description + Purpose ads answered the highest percentage of Concept A questions correctly, followed by participants who saw the Purpose ads.

5.2 Does the way Concept B is presented affect consumer comprehension of this concept?

Measure: We indexed consumer comprehension via the overall mean percentage of questions answered correctly for Concept B.

	OLS	OLS
(Intercept)	54.39% ***	63.49% ***
	(1.04%)	(1.32%)
Description	4.60% **	5.03% ***
	(1.43%)	(1.38%)
Purpose	0.36%	1.03%
	(1.51%)	(1.45%)
Description + Purpose	3.14% *	3.45% *
	(1.49%)	(1.45%)
Covariates included	No	Yes
N	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

Here we see that participants seeing the Description and Description + Purpose treatment ads answer more Concept B questions correctly compared to those seeing the Control ads. These comparisons all survive multiple comparison corrections using the Hochberg step-up procedure.

There is no meaningful difference in the number of correctly answered Concept B questions between the Control and Purpose arms.

5.3 Does the way Concept C is presented affect consumer comprehension of this concept?

Measure: We indexed consumer comprehension via the overall mean percentage of questions answered correctly for Concept C.

	OLS	OLS
(Intercept)	53.14% ***	61.05% ***
	(1.14%)	(1.44%)
Description	4.19% **	4.67% **
	(1.53%)	(1.48%)
Purpose	2.52% +	3.22% *
	(1.51%)	(1.45%)
Description + Purpose	2.29%	2.64% +
	(1.54%)	(1.50%)
Covariates included	No	Yes
N	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

Participants seeing the Description or Purpose ads answered a greater number of questions correctly compared to participants who saw the Control ads. These comparisons survive multiple comparison corrections using the Hochberg step-up procedure.

There is no meaningful difference between the Control and Description + Purpose arms on the number of Concept C questions answered correctly.

6 Exploratory Outcomes

6.1 Does the effect of the three different versions of the ads on comprehension differ by financial literacy?

Here we investigate whether the treatment effect on total percentage of questions correct differs by level of financial literacy.

Participants were asked a series of four questions indexing financial literacy, and were given a score of 0-4 depending on the total number of questions they answered correctly.

Very low financial literacy (0 correct): N = 85
Low financial literacy (1 correct): N = 221
Moderate financial literacy (2 correct): N = 386
High financial literacy (3 correct): N = 863
Very high financial literacy (4 correct): N = 949

To estimate impact of financial literacy on the treatment effect, we calculate the treatment effect for each level of financial literacy in a series of split regressions (using the full model with covariates), and present these estimates graphically.

	Very Low FL	Low FL	Moderate FL	High FL	Very High FL
(Intercept)	31.88%	45.15%	51.45%	54.87%	61.01%
	(6.52%)	(4.69%)	(3.36%)	(1.61%)	(1.53%)
Description	8.12%	-0.36%	6.65%	11.93%	7.74%
	(7.11%)	(4.43%)	(3.54%)	(1.84%)	(1.73%)
Purpose	0.25%	-0.95%	-0.21%	4.01%	3.27%
	(6.30%)	(4.16%)	(3.17%)	(1.79%)	(1.77%)
Description + Purpose	2.23%	5.33%	9.57%	7.65%	6.17%
	(6.98%)	(4.57%)	(3.13%)	(2.07%)	(1.87%)
N	85	221	386	863	949

Here we see that those with very low or low financial literacy have a very low rate of comprehension overall, but we see no convincing evidence for the superiority or inferiority of one ad framing over the others. This is because the standard errors are quite large for these two subgroups, likely due to the small sample in each of those subgroups.
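The effect of the large standard errors can be seen by converting the Description estimates from the table into approximate 95% confidence intervals. This sketch assumes a normal approximation (estimate ± 1.96 standard errors), which is rough for the smallest subgroup; the function name is hypothetical.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Approximate 95% confidence interval: estimate +/- z * std_error."""
    return estimate - z * std_error, estimate + z * std_error


# Description-arm estimates and standard errors (percentage points),
# taken from the financial literacy table above.
description = {
    "Very Low FL": (8.12, 7.11),
    "Low FL": (-0.36, 4.43),
    "Moderate FL": (6.65, 3.54),
    "High FL": (11.93, 1.84),
    "Very High FL": (7.74, 1.73),
}

for group, (estimate, std_error) in description.items():
    low, high = confidence_interval(estimate, std_error)
    verdict = "includes zero" if low < 0 < high else "excludes zero"
    print(f"{group}: [{low:.2f}, {high:.2f}] ({verdict})")
```

Under this approximation, the intervals for the three lower financial literacy groups include zero, while those for the high and very high groups do not.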

For those with moderate financial literacy, we see modest gains in comprehension when using either framing incorporating descriptions (but not significantly so when using Description only framing).

As expected, consumer comprehension is highest overall for those with high or very high financial literacy, and we see increases in consumer comprehension when using either framing incorporating description, but the increases in comprehension are greatest for consumers who saw the Description only frame.

6.2 Does the effect of the three different versions of the ads on comprehension differ by CALD status?

Here we investigate whether the treatment effect on total percentage of questions correct differs by CALD status (as indexed by whether the participant spoke a language other than English at home).

Non-CALD: N = 2,192
CALD: N = 312

To estimate the impact of CALD status on the treatment effect, we calculate the treatment effect for CALD versus Non-CALD participants via split regressions (using the full model with covariates), and present these estimates graphically.

	Non-CALD	CALD
(Intercept)	61.02%	60.87%
	(1.15%)	(2.67%)
Description	9.16%	3.78%
	(1.21%)	(3.23%)
Purpose	2.50%	3.06%
	(1.19%)	(3.00%)
Description + Purpose	7.63%	3.11%
	(1.28%)	(3.27%)
N	2,192	312

Non-CALD participants show the same treatment effect as described in the primary analysis, but this significant effect is not observed in CALD participants.

For CALD participants, no single treatment frame results in increased percentage of questions answered correctly (however, this may be due to low statistical power, as the CALD sample is small, as expected).

6.3 Does the effect of the three different versions of the ads on comprehension differ by whether the participant is a small business owner vs regular household?

Here we investigate whether the treatment effect on total percentage of questions correct differs by whether the participant owns a small business.

Small business owners: N = 395
Not a small business owner: N = 2,109

To estimate the impact of small business ownership on the treatment effect, we calculate the treatment effect for small business owners versus non small business owners via split regressions (using the full model with covariates), and present these estimates graphically.

	Not a Small Business Owner	Small Business Owner
(Intercept)	61.73%	56.09%
	(1.15%)	(2.58%)
Description	8.86%	5.56%
	(1.23%)	(2.96%)
Purpose	3.18%	0.56%
	(1.19%)	(2.80%)
Description + Purpose	7.25%	6.82%
	(1.31%)	(2.95%)
N	2,109	395

Overall we see that the treatment effect does not differ meaningfully when considering respondents who own a small business, versus those who do not.

6.4 Does the effect of the three different versions of the ads on comprehension differ by whether participants make the decision themselves, or with another person?

Here we investigate whether the treatment effect on total percentage of questions correct differs by whether the participant makes energy decisions by themselves, or with another person.

Joint Decision Maker: N = 955
Sole Decision Maker: N = 1,549

To estimate the impact of being a sole versus joint energy decision maker on the treatment effect, we calculate the treatment effect for sole decision makers versus joint decision makers via split regressions (using the full model with covariates), and present these estimates graphically.

	Sole Decision Maker	Joint Decision Maker
(Intercept)	60.46%	61.05%
	(1.38%)	(1.63%)
Description	8.87%	7.82%
	(1.43%)	(1.87%)
Purpose	1.16%	4.83%
	(1.39%)	(1.78%)
Description + Purpose	6.96%	7.43%
	(1.51%)	(1.97%)
N	1,549	955

We do not see a meaningful difference in the treatment effects between those who are the sole decision maker regarding energy in the home, and those who make the energy decisions with another person.
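One way to gauge whether a coefficient differs between two split regressions is a simple z-test on the difference, treating the two subsamples as independent (they are disjoint groups of participants). This is an illustrative check under a normal approximation, not a test reported in this appendix; the function name is hypothetical and the inputs are the rounded Purpose estimates from the table above.

```python
from math import erfc, sqrt


def subgroup_difference(est_a, se_a, est_b, se_b):
    """z-test for the difference between two independent coefficient estimates."""
    diff = est_a - est_b
    se_diff = sqrt(se_a ** 2 + se_b ** 2)  # variances add for independent samples
    z = diff / se_diff
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value, normal approximation
    return diff, se_diff, z, p


# Purpose arm (percentage points): sole versus joint decision makers.
diff, se_diff, z, p = subgroup_difference(1.16, 1.39, 4.83, 1.78)
print(f"difference = {diff:.2f}, SE = {se_diff:.2f}, z = {z:.2f}, p = {p:.3f}")
```

An interaction term in a single pooled regression would be the more usual way to test this formally; the sketch above is only a quick approximation from the reported figures.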

6.5 Can energy consumers select the cheapest plan in the context of EME with the addition of concept A?

AER/ACCC are interested in understanding what kinds of information consumers use when making decisions on the Energy Made Easy (EME) platform, and whether adding content regarding the ‘reference price’ (Concept A) will aid consumers in selecting the cheapest plans.

For this question, participants were presented with a mock EME ad, and were asked to select the cheapest plan.

Measure: The outcome of interest is the percentage of participants correctly selecting the cheapest energy plan.

	OLS	Logistic	OLS	Logistic
(Intercept)	55.23% ***		65.96% ***
	(2.00%)		(2.66%)
Description	6.34% *	6.34% *	6.77% *	6.77% *
	(2.77%)	(2.77%)	(2.72%)	(2.72%)
Purpose	-2.76%	-2.76%	-2.10%	-2.09%
	(2.83%)	(2.83%)	(2.80%)	(2.80%)
Description + Purpose	5.95% *	5.95% *	6.24% *	6.23% *
	(2.81%)	(2.81%)	(2.79%)	(2.79%)
Covariates included	No	No	Yes	Yes
N	2,504	2,504	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

Both the Description and Description + Purpose frames result in a greater percentage of individuals correctly selecting the cheapest plan.
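The OLS and logistic estimates in the columns without covariates are identical, which is expected: with only treatment indicators as predictors, the logistic model reproduces each arm's observed proportion, so its average marginal effect equals the difference in proportions. The sketch below shows this for one arm against Control using hypothetical toy data; the function names are assumptions.

```python
from math import exp, log


def logit(p):
    return log(p / (1 - p))


def sigmoid(x):
    return 1 / (1 + exp(-x))


def logistic_ame(treated, control):
    """Average marginal effect of a treatment dummy in a logistic regression
    with no other predictors (both proportions must lie strictly in (0, 1))."""
    p_control = sum(control) / len(control)
    p_treated = sum(treated) / len(treated)
    # Closed-form maximum-likelihood estimates for this saturated model.
    intercept = logit(p_control)
    beta = logit(p_treated) - intercept
    # Predicted probability if treated minus if not; without covariates this
    # difference is the same for every participant, so it is also the average.
    return sigmoid(intercept + beta) - sigmoid(intercept)


# Hypothetical binary outcomes (1 = selected the cheapest plan).
control = [1, 0, 1, 0]
treated = [1, 1, 1, 0]

ame = logistic_ame(treated, control)
ols_coefficient = sum(treated) / len(treated) - sum(control) / len(control)
print(ame, ols_coefficient)
```

Once covariates are added the two approaches can diverge slightly, as in the small differences visible in the columns with covariates.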

6.6 How do consumers make decisions through Energy Made Easy?

For this question, participants were presented with a mock EME ad, and were asked to select the cheapest plan.

Measure: Participants were asked to indicate how they made their decision - i.e., which aspects of the ad they paid attention to in deciding the cheapest offer.

Below we can see that few people (~10%) relied on Concept A to select the cheapest offer, with the majority of participants relying on the dollar values assigned to the plans, either via price per year, or price with discounts.

Interpret with caution: Note that due to the layout of EME, it is possible that participants were unable to distinguish between the price with discounts, and the price per year.

6.7 What term do consumers prefer to see on their energy ads?

AER/ACCC are additionally interested in understanding how they should inform consumers about who sets the reference price.

Measure: Following the EME ads, all participants were asked to indicate which term they would prefer to see on energy ads:
The Government
The Australian Energy Regulator
They are the same
Don’t know/Not sure

We can see that consumers do not have a strong preference for the terms “Government” or “Australian Energy Regulator”. Indeed, approximately 31-37% of consumers believe that the terms represent the same thing. This mimics the overall low level of knowledge we observed through the qualitative interviews.

Interpret with caution: The Description and Description + Purpose versions of the ads included information indicating that the Reference Price is set by government. As a result of being exposed to the term before, we cannot be sure whether participants in these versions of the ads truly prefer the term “government” or not.

6.8 Self-reported difficulty in understanding the 3 concepts

Measure: Before answering the questions for each concept, we asked participants to indicate how well they believed they understand the concept. For these questions, a rating of 1 = I don’t understand it at all, and 5 = I understand it completely.

	OLS	OLS
(Intercept)	3.20 ***	3.35 ***
	(0.04)	(0.05)
Description	0.30 ***	0.30 ***
	(0.05)	(0.05)
Purpose	0.30 ***	0.31 ***
	(0.05)	(0.05)
Description + Purpose	0.33 ***	0.33 ***
	(0.06)	(0.06)
Covariates included	No	Yes
N	2,504	2,504
*** p < 0.001; ** p < 0.01; * p < 0.05; + p < 0.1.

We see that participants who saw any of the three treatment versions of the ads report a greater level of understanding of the concepts, relative to the Control wording. There are no meaningful differences between the three treatment versions in how well participants report understanding the concepts.

6.9 Does self-reported comprehension match actual comprehension?

Here we look at the relationship between self-reported comprehension and actual comprehension, as indexed by the percentage of questions each participant answered correctly. We calculate a correlation coefficient for the relationship, and then visualise this relationship below.

Comprehension and self-reported comprehension are only moderately related (r = 0.35). This suggests that consumers do not necessarily have good insight regarding their comprehension of energy advertisements.

It is interesting to note here that people think that they understand the concepts - i.e., they have relatively high self-reported comprehension, overall.

But their actual comprehension (as measured in the primary analysis above) is quite low. From the relationship between actual comprehension and self-reported comprehension, we can see that about 12% of the variance in actual understanding is explained by self-reported comprehension.
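The 12% figure follows from squaring the correlation coefficient: 0.35 squared is roughly 0.12. The sketch below shows a plain Pearson correlation alongside that arithmetic; the toy data and the function name `pearson_r` are hypothetical, and only the value r = 0.35 comes from the analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: self-rated understanding (1-5) and % correct.
self_rated = [1, 2, 3, 4]
percent_correct = [20.0, 40.0, 60.0, 80.0]
print(pearson_r(self_rated, percent_correct))  # perfectly linear toy example

# Share of variance explained implied by the reported correlation.
r_reported = 0.35
variance_explained = r_reported ** 2
print(f"{variance_explained:.2%}")
```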

This shows that even though people think that they understand the concepts, their behaviours (as indexed by answering questions correctly) don’t indicate that they clearly understand the concepts, and that they may actually make many mistakes when understanding the concepts.

Technical Appendix: Testing the Concept of a Reference Price

The Behavioural Insights Team