Reliability
We all understand (I hope) that assessment tools need to be developed with some kind of evidence criteria.
The principle of assessment reliability states:
Evidence presented for assessment is consistently interpreted and assessment results are comparable irrespective of the assessor conducting the assessment.
That means we need to have some kind of decision-making rules in place to ensure that assessment evidence is judged consistently, regardless of who undertakes the assessment.
Put simply, there can’t be any situation where John is an easier marker than Jane. Regardless of who assesses, the context of assessment or the time and location of assessment, students can expect to be assessed equally, consistently and to the same standard.
Evidence criteria
ASQA provides clear guidance about the types of evidence criteria (decision making rules) that we can use when developing assessment tools.
From the Users' Guide to the Standards for RTOs 2015 – Implementing the Principles of Assessment:

"Develop evidence criteria (i.e. decision-making rules) to judge the quality of performance. This will help assessors make consistent judgements about competence. Evidence criteria could include:
- model answers (where appropriate)
- descriptions of observations needed to assess skills and application of knowledge in a practical activity."
So, where appropriate, we could use model answers (example answers), or we could use descriptions of the observations needed to assess skills and the application of knowledge in a practical activity (i.e. clearly defined criteria that an assessor can check against when reviewing or observing practical performance).
Product-based assessment
One way we can assess the practical application of skills and knowledge is through product-based assessment.
Product-based assessment is defined by ASQA as follows:
- Structured assessment activities such as reports, displays, work samples, role plays, and presentations
- A purposeful collection of work samples (e.g., a portfolio) of annotated and validated pieces of evidence, compiled by the student
- Evidence could include written documents, photographs, videos or logbooks
Source: Guide to Assessment Tools, https://www.asqa.gov.au/media/313
There are lots of circumstances where we may use product-based assessment methods. For example:
- A photography student may submit photographs for assessment
- A business student may submit a report or project plan for assessment
- A retail student might put together a merchandising display for assessment
- A marketing student might prepare a presentation for assessment
- A hospitality student might prepare an espresso coffee for assessment
- An RPL candidate may present a portfolio of documents
All of the items above are products that students have created, and that we will need to assess.
Evidence criteria for product-based assessment
So, based on the above assessment scenarios, what are some appropriate forms of evidence criteria?
We certainly wouldn’t have an example answer as a benchmark for assessing the consistency and quality of a cappuccino (I hope).
What about a sample product? Would we have an ‘example photo’ to assess the photography student’s work against? Would we have an example merchandising display as the benchmark for the retail student’s assessment? Probably not.
What we’d probably do on these occasions is develop clearly defined criteria against which we’d assess the product. We’d have a checklist.
With clearly defined criteria, we can assess the student’s performance based on the product. If we’re assessing a cappuccino, we’d establish criteria for the appearance of the coffee. For example, there may be criteria about the foam sitting in the centre of the cup with the crema around the outside, or about the edges of the cup being clean and the sides free of drips. We can observe those things when we review the product the student presents.
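To illustrate, an observation checklist built from those criteria might look something like this (an illustrative sketch only, not an official benchmark; each item is something an assessor can verify with a yes/no observation):
- The foam sits in the centre of the cup
- The crema is visible around the outside of the foam
- The edges of the cup are clean
- The sides of the cup are free of drips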

In fact, I believe clearly defined evidence criteria, in the form of an observation checklist, are appropriate ‘evidence criteria’ for all of the product-based assessment examples above.
But what about written products?
While some of the product-based examples I listed earlier are quite tangible, some of them involve written products. For example, a report, a project plan or a written presentation.
Many people believe that an ‘example answer’ is required for these types of tasks. I’d like to challenge that belief.
Let’s consider a unit like BSBWRT411 Write complex documents. In this unit students must plan, draft and finalise three different complex documents. I’d imagine that almost all assessment tools for this unit include product-based assessment.
In some circumstances, the tool might be extremely prescriptive about what each document must contain, but since we’re in Vocational Education and Training, I’d hope that our assessment is flexible and that work-based students can develop documents across a diverse range of contexts and subject matter.
Would it be useful to create example documents as the ‘evidence criteria’ when assessing this unit? How effective would the example be in guiding assessors who are looking at a vast variety of documents from different contexts? Even when students write about the same subject or content, the way each student composes the document will obviously differ. How would an example help us assess those variations in a consistent way?
I don’t think it would. In fact, I imagine this type of evidence criteria would cause more confusion for assessors, because they now have to somehow interpret the requirements from an example, and we’re relying on the hope that all of our assessors arrive at the same interpretation. We could be asking assessors to compare business reports about different subjects, written for different organisations, with a hundred other variables besides. We’re asking them to compare apples with oranges.
Clearly defined criteria (a checklist), however, would give assessors very clear guidance on what to look for when reviewing each document. They can easily identify whether the document meets the criteria, which should be made up of observable components that assessors can clearly measure the student’s work against. For example, specific criteria may describe the standard of grammar and presentation, and how the content aligns with the document’s purpose and requirements.
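As a sketch only (the specific items would depend on the unit requirements and the assessment context), a checklist for a written document might include items like:
- The document structure and layout suit its stated purpose and audience
- Grammar, spelling and punctuation meet the required standard
- The content addresses all requirements set out in the task instructions
- Formatting is applied consistently throughout the document
Each item can be answered with a clear yes or no, regardless of the document’s subject matter.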
When we have tight, unarguable criteria, we’ll get consistent assessment decisions.

Don’t overcomplicate what’s already complicated
Before I go any further: it is definitely not ‘non-compliant’ to create an example answer for a written product if it is combined with clearly defined criteria. But we don’t need both an example answer and an assessment checklist.
In fact, I’d argue that having both can actually be confusing. Which one should the assessor use? How do they use both? How do the competing criteria interrelate? If you’re using two forms of evidence criteria without clear instructions on how assessors should apply both when reviewing the assessment evidence, you could even have a reliability issue.
But the question I’d like anyone using two forms of evidence criteria to ask themselves is: why? Why are we requiring both? Is it because we want to be extra sure we’re compliant? Is it compliance anxiety? Are we overcomplicating things?
We have so much red tape in RTOs, and I really do think there are areas where we end up making things worse by overcomplicating the already complicated.
Don’t enforce double requirements, like two types of evidence criteria or benchmarks, just because someone told you you should, or because it’s always been done that way. And don’t act on something someone else told you an auditor told them – they may have misinterpreted what was said (which I confess to having done myself).
Challenge any process or decision that creates additional work by asking yourself which clause in the Standards the process is based on. If the process isn’t required by the Standards, then it isn’t a compliance requirement. It may just be someone’s compliance anxiety making everyone’s job that little bit harder.
Coleen

I agree with everything you say, Coleen, and am amazed it still needs to be said. Surely by now (says me, who has been working in the VET sector for more than 30 years) assessors would know that a “sample” is not a marking guide or a set of criteria. The only times I’ve used samples have been to show students how marking criteria have been addressed by students in the past. The criteria are the benchmarks used for assessment; the sample is simply that.