Correctness vs. understanding

Recently, I was asked by a colleague for a clear, simple example that would show how DiscoTest items differ from the items on conventional standardized tests. My first thought was that this would be impossible without oversimplifying. My second thought was that it might be okay to oversimplify a bit. So, here goes!

The table below lists four differences between what Lectica measures and what is measured by other standardized assessments.1 The descriptions are simplified and lack nuance, but the distinctions are accurate.

            | Lectical Assessments | Other standardized assessments
Scores      | represent level of understanding, based on a valid learning scale | represent the number of correct answers
Target      | the depth of an individual's understanding (demonstrated in the complexity of arguments and the way the test taker works with knowledge) | the ability to recall facts or to apply rules, definitions, or procedures (demonstrated by correct answers)
Format      | paragraph-length written responses | primarily multiple choice or short written answers2
Responses   | explanations | right/wrong judgments or right/wrong applications of rules and procedures

The example

I chose a scenario-based example that we're already using in an assessment of students' conceptions of the conservation of matter. We borrowed the scenario from a pre-existing multiple choice item.

The scenario

Sophia balances a pile of stainless steel wire and ordinary steel wire on a scale. After a few days the ordinary wire in the pan on the right starts rusting.

Conventional multiple choice question

What will happen to the pan with the rusting wire?

  1. The pan will move up.
  2. The pan will not move.
  3. The pan will move down.
  4. The pan will first move up and then down.
  5. The pan will first move down and then up.

(Go ahead, give it a try! Which answer would you choose?)

Lectical Assessment question

What will happen to the height of the pan with the rusting wire? Please explain your answer thoroughly.

Here are three examples of responses from 12th graders.

Lillian: The pan will move down because the rusted steel is heavier than the plain steel.

Josh: The pan will move down, because when iron rusts, oxygen atoms get attached to the iron atoms. Oxygen atoms don't weigh very much, but they weigh a bit, so the rusted iron will "gain weight," and the scale will go down a bit on that side.

Ariana: The pan will go down at first, but it might go back up later. When iron oxidizes, oxygen from the air combines with the iron to make iron oxide. So, the mass of the wire increases, due to the mass of the oxygen that has bonded with the iron. But iron oxide is non-adherent, so over time the rust will fall off of the wire. If the metal rusts for a long time, some of the rust will become dust and some of that dust will very likely be blown away.

Debrief

The correct answer to the multiple choice question is, "The pan will move down."

There is no single correct answer to the Lectical Assessment item. Instead, there are answers that reveal different levels of understanding. Most readers will immediately see that Josh's answer reveals more understanding than Lillian's, and that Ariana's reveals more understanding than Josh's.

You may also notice that Ariana's written response would have led her to choose one of the incorrect multiple-choice answers, and that Lillian and Josh receive equal credit for correctness even though their levels of understanding are not equally sophisticated.
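For the programmatically inclined, here's a minimal sketch of the difference between the two scoring approaches (Python). The three-level rubric and the keyword matching are invented for this illustration and are nothing like Lectica's actual scoring system or scale:

```python
# Toy contrast between correctness scoring and understanding-based scoring.
# The rubric and keyword matching below are invented for this example;
# they are NOT Lectica's actual scoring system or scale.

def score_correctness(choice: int, key: int = 3) -> int:
    """Conventional scoring: 1 point for the keyed option, 0 otherwise."""
    return 1 if choice == key else 0

def score_understanding(explanation: str) -> int:
    """Assign a made-up level based on the reasoning the explanation reveals."""
    text = explanation.lower()
    if "fall off" in text:   # coordinates the mechanism with a competing process
        return 3
    if "oxygen" in text:     # names the causal mechanism for the weight gain
        return 2
    if "heavier" in text:    # asserts the outcome without a mechanism
        return 1
    return 0

responses = {
    "Lillian": "The pan will move down because the rusted steel is heavier.",
    "Josh": "Oxygen atoms get attached to the iron atoms, so the iron gains weight.",
    "Ariana": "Mass increases at first, but the non-adherent rust will fall off.",
}

print(score_correctness(3), score_correctness(1))  # 1 0 -- all-or-nothing credit
for name, text in responses.items():
    print(name, score_understanding(text))
# Lillian 1, Josh 2, Ariana 3 -- one "correct" direction, three levels of understanding.
```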

Why is all of this important?

  • It's not fair! The multiple choice item cheats Ariana of the chance to show off what she knows, and it treats Lillian and Josh as if their levels of understanding were identical.
  • The multiple choice item provides no useful information to students or teachers! The most we can legitimately infer from a correct answer is that the student has learned that when steel rusts, it gets heavier. That answer is a fact, and the ability to identify a fact does not tell us how it is understood.
  • Without understanding, knowledge isn't useful. Facts that are not supported with understanding are useful on Jeopardy, but less so in real life. Learning that does not increase understanding or competence is a tragic waste of students' time.
  • Despite clear evidence that correct answers on standardized tests do not measure understanding and are therefore not a good indicator of usable knowledge or competence, we continue to use scores on these tests to make decisions about who will get into which college, which teachers deserve a raise, and which schools should be closed.
  • We value what we measure. As long as we continue to measure correctness, school curricula will emphasize correctness, and deeper, more useful forms of learning will remain relatively neglected.

None of these points is particularly controversial. Most educators agree on the importance of understanding and competence. What's been missing is the ability to measure understanding at scale and in real time. Lectical Assessments are designed to fill this gap.

1Many alternative assessments are designed to measure understanding—at least to some degree—but few of these are standardized or scalable. 

2See my examination of a PISA item for an example of a typical written response item from a highly respected standardized test.

Benchmarks: education, jobs, and the Lectical Scale

I'm frequently asked about benchmarks. My most frequent response is something like, "Setting benchmarks requires more data than we have collected so far," or "Benchmarks are just averages; they don't necessarily apply to particular cases, but people tend to use them as if they do." Well, that last excuse will probably always hold true, but now that our database contains more than 43,000 assessments, the first response is a little less true. So, I'm pleased to announce that we've published a benchmark table that shows how educational and workplace role demands relate to the Lectical Scale. We hope you find it useful!

Front-line to mid-level management recruitment assessment—Lectica style

For the first time—the world's best workplace assessments by subscription!

Lectical Assessments have been used to support senior and executive recruitment for over 10 years, but the expense of human scoring has prohibited their use at scale. I'm DELIGHTED to report that this is no longer the case. Because of CLAS—our electronic developmental scoring system—we can now deliver customized assessments of workplace reasoning in real time—by annual subscription. We're calling this subscription service lecticafirst.

Lecticafirst is a recruitment assessment subscription service for positions from the front line through mid-level management.* It allows you to administer as many lecticafirst assessments as you'd like, any time you'd like. And we've built in several upgrade options, so you can easily obtain more information about candidates who capture your interest.

learn more about lecticafirst subscriptions


The current state of recruitment assessment

Two broad constructs are commonly assessed in recruitment—aptitude and personality. These assessments examine factors like literacy, numeracy, role-specific competencies, leadership traits, and cultural fit, and are generally delivered through interviews or through multiple choice or Likert-style surveys. Emotional intelligence is also sometimes measured, but thus far it has not produced results that can compete with aptitude tests (Zeidner, Matthews, & Roberts, 2004).

Like Lectical Assessments, aptitude tests are tests of mental ability. High-quality aptitude tests have the highest predictive validity for recruitment purposes, hands down. Hunter and Hunter (1984), in their systematic review of the literature, found an effective range of predictive validity for aptitude tests of .45 to .54. Translated, this means that about 20% to 29% of success on the job was predicted by aptitude. (Interview assessments focused on aptitude generally have predictive validities that are lower.) These numbers do not appear to have changed appreciably since Hunter and Hunter's review.

Personality tests come in a distant second. In their meta-analysis of the literature, Tett, Jackson, and Rothstein (1991) reported an overall relation between personality and job performance of .24. Translated, this means that about 6% of job performance is predicted by personality traits. These numbers do not appear to have been challenged in more recent research (Johnson, 2001).
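For readers who want to check these translations, the proportion of variance in performance explained is simply the square of the validity coefficient:

$$ .45^2 \approx .20, \qquad .54^2 \approx .29, \qquad .24^2 \approx .06 $$

That is, roughly 20%, 29%, and 6% of the variance in job performance, respectively.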

Why use Lectical Assessments for recruitment?

Lectical Assessments are "next generation" assessments, made possible through a novel synthesis of developmental theory, primary research, and technology. Until now, aptitude tests have been the most valid and affordable option for employers. But despite being more predictive than personality tests, aptitude tests still suffer from important limitations. Lectical Assessments address these limitations. For details, take a look at the side-by-side comparison of lecticafirst tests with conventional aptitude tests, below.

Dimension | Lecticafirst | Aptitude tests
Accuracy | Reliability (.95–.97) makes them accurate enough for high-stakes decision-making. | Varies greatly. The best aptitude tests have reliability in the .95 range; many recruitment tests have much lower levels.
Time investment | Lectical Assessments are not timed. They usually take 45–60 minutes, depending on the individual test taker. | Varies greatly. For acceptable accuracy, tests must have many items and may take hours to administer.
Objectivity | Scores are objective (computer scoring is blind to differences in sex, body weight, ethnicity, etc.). | Scores on multiple choice tests are objective; scores on interview-based tests are subject to several sources of bias.
Expense | Highly competitive subscription (from $6–$10 per employee annually). | Varies greatly.
Fit to role: complexity | Lectica employs sophisticated developmental tools and technologies to efficiently determine the relation between role requirements and the level of reasoning skill required to meet those requirements. | Lectica's approach is not directly comparable to other available approaches.
Fit to role: relevance | Readily customized to fit particular jobs; direct measures of what's most important—whether or not candidates' actual workplace reasoning skills are a good fit for a particular job. | Measure people's ability to select correct answers to abstract problems, in the hope that these answers will predict candidates' workplace reasoning skills.
Predictive validity | So far: predicts performance (R = .53, R² = .28). | The aptitude (IQ) tests used in published research predict performance (R = .45 to .54, R² = .20 to .29).
Cheating | The written-response format makes cheating virtually impossible when assessments are taken under observation, and very difficult when taken without observation. | Cheating is relatively easy, and rates can be quite high.
Formative value | High. Lecticafirst assessments can be upgraded after hiring, then used to inform employee development plans. | None. Aptitude is a fixed attribute, so there is no room for growth.
Continuous improvement | Developed with a 21st-century learning technology that allows us to continuously improve the predictive validity of lecticafirst assessments. | Built with a 20th-century technology that does not easily lend itself to continuous improvement.

* CLAS is not yet fully calibrated for scores above 11.5 on our scale. Scores at this level are more often seen in upper- and senior-level managers and executives. For this reason, we do not recommend using lecticafirst for recruitment above mid-level management.


References

Hunter, J. E., & Hunter, R. F. (1984). The validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72-98.

Johnson, J. (2001). Toward a better understanding of the relationship between personality and individual job performance. In M. R. Barrick (Ed.), Personality and work: Reconsidering the role of personality in organizations (pp. 83-120).

Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703-742.

Zeidner, M., Matthews, G., & Roberts, R. D. (2004). Emotional intelligence in the workplace: A critical review. Applied Psychology: An International Review, 53(3), 371-399.

Decision making & the collaboration continuum

When we create a Lectical Assessment, we make a deep (and never-ending) study of how the skills and knowledge targeted by that assessment develop over time. The research involves identifying key concepts and skills and studying their evolution on the Lectical Scale (our developmental scale). The collaboration continuum has emerged from this research.

As it applies to decision making, the collaboration continuum is a scale that runs from fully autocratic to consensus-based. Although it is a continuum, we find it useful to think of the scale as having 7 relatively distinct levels, as shown in the table below:


Level | Basis for decision | Applications | Strengths & limitations

LESS COLLABORATION

Fully autocratic | personal knowledge or rules; no consideration of other perspectives | everyday operational decisions where there are clear rules and no apparent conflicts | quick and efficient
Autocratic | personal knowledge, with some consideration of others' perspectives (no perspective seeking) | operational decisions in which conflicts are already well understood and trust is high | quick and efficient, but spends trust, so should be used with care
Consulting | personal knowledge, with perspective seeking to help people feel heard | operational decisions in which the perspectives of well-known stakeholders are in conflict and trust needs reinforcement | time consuming, but can build trust if not abused
Inclusive | personal knowledge, with perspective seeking to inform the decision | operational or policy decisions in which stakeholders' perspectives are required to formulate a decision | time consuming, but improves decisions and builds engagement
Compromise-focused | leverages stakeholder perspectives to develop a decision that gives everyone something they want | making "deals" to which all stakeholders must agree | time consuming, but necessary in deal-making situations
Consent-focused | leverages stakeholder perspectives to develop a decision that everyone can consent to (even if with reservations) | policy decisions in which stakeholders' perspectives are required to formulate a decision | can be efficient, but requires excellent facilitation skills and training for all parties
Consensus-focused | leverages stakeholder perspectives to develop a decision that everyone can agree with | decisions in which complete agreement is required | requires strong relationships; useful primarily when decision-makers are equal partners

MORE COLLABORATION

As the table shows, all 7 forms of decision making on the collaboration continuum have legitimate applications, and all can be learned at any adult developmental level. However, the most effective application of each successive form of decision making requires more developed skills. Inclusive, consent-focused, and consensus-focused decision making are particularly demanding, and consent-focused decision making requires formal training for all participating parties.

The most developmentally advanced and accomplished leaders who have taken our assessments deftly employ all 7 forms of decision making, basing the form chosen for a particular situation on factors like timeline, decision purpose, and stakeholder characteristics. 
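For readers who like to see the structure spelled out, here's a minimal sketch of the continuum as data (Python). The ordering comes from the table above, but the suggest_mode heuristic is my own invented illustration of choosing a form based on situational factors; it is not Lectica's actual guidance:

```python
from enum import IntEnum

class DecisionMode(IntEnum):
    """The seven forms of decision making, ordered from less to more collaboration."""
    FULLY_AUTOCRATIC = 1
    AUTOCRATIC = 2
    CONSULTING = 3
    INCLUSIVE = 4
    COMPROMISE_FOCUSED = 5
    CONSENT_FOCUSED = 6
    CONSENSUS_FOCUSED = 7

def suggest_mode(time_pressure: bool, stakeholders_conflict: bool,
                 need_buy_in: bool, equal_partners: bool) -> DecisionMode:
    """Invented heuristic: map a few situational factors to a decision mode.
    Real situations weigh many more factors (trust, purpose, skill, etc.)."""
    if equal_partners and need_buy_in:
        return DecisionMode.CONSENSUS_FOCUSED
    if need_buy_in:
        return DecisionMode.CONSENT_FOCUSED
    if stakeholders_conflict and not time_pressure:
        return DecisionMode.INCLUSIVE
    if stakeholders_conflict:
        return DecisionMode.CONSULTING
    return DecisionMode.FULLY_AUTOCRATIC if time_pressure else DecisionMode.AUTOCRATIC

# A routine operational call under deadline, no conflicts, no buy-in needed:
print(suggest_mode(time_pressure=True, stakeholders_conflict=False,
                   need_buy_in=False, equal_partners=False).name)
# -> FULLY_AUTOCRATIC
```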

(The feedback in our LDMA [leadership decision making] assessment report provides learning suggestions for building collaboration continuum skills. And our Certified Consultants can offer specific practices, tailored for your learning needs, that support the development of these skills.) 

VCoL & flow: Can Lectical Assessments increase happiness?

Last week, I received an inquiry about the relation between flow states (Csikszentmihalyi & colleagues) and the natural dopamine/opioid learning cycle that undergirds Lectica's learning model, VCoL+7. The short answer is that flow and the natural learning cycle have a great deal in common. The primary difference appears to be that flow can occur during almost any activity, while the natural learning cycle is specifically associated with learning. Also, flow has been associated with neurochemicals we haven't (yet?) incorporated into our conception of the natural learning cycle. We'll be tracking the literature to see if research on these neurochemicals suggests modifications.

The similarities between flow states and the dopamine/opioid learning cycle are numerous. Both involve dopamine (striving & focus) and opioids (reward). And researchers who have studied the role of flow in learning even use the term "Goldilocks Zone" to describe students' learning sweet spot—the place where interest and challenge are just right to stimulate the release of dopamine, and where success happens just often enough to trigger the release of opioids (which stimulate the desire for more learning, starting the cycle again).

Since psychologist Mihaly Csikszentmihalyi began his studies of flow, it has been linked to feelings of happiness and euphoria, and to peak performance among workers, scientists, athletes, musicians, and many others. Flow has also been shown to deepen learning and support interest.

Flow is gradually making its way into the classroom. It's featured on UC Berkeley's Greater Good site in several informative articles designed to help teachers bring flow into the classroom.

"Teachers want their kids to find “flow,” that feeling of complete immersion in an activity, where we’re so engaged that our worries, sense of time, and self-consciousness seem to disappear."

Advice for stimulating flow is similar to our advice for teaching and learning in the Goldilocks Zone, and includes suggestions like the following:

  • Challenge kids—but not too much. 
  • Make assignments relevant to students’ lives.
  • Encourage choice, feed interest.
  • Set clear goals (and give feedback along the way).
  • Offer hands-on activities.

If you've been following our work, these suggestions should sound very familiar.

All in all, the flow literature provides additional support for the value of our mission to deliver learning tools that help teachers help students learn in the zone.


VCoL+7: Can it save democracy?

Our learning model, the Virtuous Cycle of Learning and its +7 skills (VCoL+7), is more than a way of learning—it's a set of tools that helps students build a relationship with knowledge that's uniquely compatible with democratic values.

Equal opportunity: In the company of good teachers and the right metrics, VCoL makes it possible to create a truly level playing field for learning—one in which all children have a real opportunity to achieve their full learning potential.

Freedom: VCoL shifts the emphasis from learning a particular set of facts, vocabulary, rules, procedures, and definitions, to building transferable skills for thinking, communicating, and learning, thus allowing students greater freedom to learn essential skills through study and practice in their own areas of interest.

Pursuit of happiness: VCoL leverages our brain's natural motivational cycle, allowing people to retain their inborn love of learning. Thus, they're equipped not only with skills and knowledge, but with a disposition to adapt and thrive in a complex and rapidly changing world.

Citizenship: VCoLs build skills for (1) coping with complexity, (2) gathering, evaluating, & applying information, (3) perspective seeking & coordination, (4) reflective analysis, and (5) communication & argumentation, all of which are essential for the high quality decision making required of citizens in a democracy. 

Open mindset: VCoLs treat all learning as partial or provisional, which fosters a sense of humility about one's own knowledge. A touch of humility can make citizens more open to considering the perspectives of others—a useful attribute in democratic societies.

All of the effects listed here refer primarily to VCoL itself—a cycle of goal setting, information gathering, application, and reflection. The +7 skills—reflectivity, awareness, seeking and evaluating information, making connections, applying knowledge, seeking and working with feedback, and recognizing and overcoming built-in biases—amplify these effects.

VCoL is not only a learning model for our times; it could well be the learning model that helps save democracy.

Why we need to LEARN to think

I'm not sure I buy the argument that reason developed to support social relationships, but the body of research described in this New Yorker article clearly exposes several built-in biases that get in the way of high quality reasoning. These biases are the reason why learning to think should be a much higher priority in our schools (and in the workplace). 

Transformative & embodied learning

I'm frequently asked about the relation between transformative learning and what we, at Lectica, call robust, embodied learning.

According to Mezirow, there are two kinds of transformative learning: learning that transforms one's point of view and learning that transforms a habit of mind.

Transforming a point of view: This kind of transformation occurs when we have an experience that causes us to reflect critically on our current conceptions of a situation, individual, or group. 

Transforming a habit of mind: This more profound kind of transformation occurs when we become critically reflective of a generalized bias in the way we view situations, people, or groups. It is less common and more difficult than a transformation of point of view, and it occurs only after several transformations in point of view.

Embodied learning occurs through natural and learned virtuous cycles in which we take in new information, apply it in some way, and reflect on outcomes. The natural cycles occur in a process Piaget referred to as reflective abstraction. The learned process, which we call VCoL (for virtuous cycle of learning), deliberately reproduces and amplifies elements of this unconscious process, incorporating conscious critical reflection as part of every learning cycle. These acts of critical reflection reinforce connections that are affirmed (or create new connections) and prune connections that are negated. Virtuous learning cycles, both conscious and unconscious, incrementally build a mental network that not only connects ideas, but also different parts of the brain, including those involved in motivation and emotion.
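Here's a toy illustration of that strengthen-or-prune dynamic (a Python sketch; the weights, update rule, and example connections are invented for illustration, not a model of actual neural mechanisms):

```python
# Toy model of a virtuous learning cycle: connections that reflection affirms
# are strengthened; connections that reflection negates decay toward pruning.
# All numbers and the update rule are invented for this illustration.

connections = {"rust-adds-mass": 0.5, "rust-is-lighter": 0.5}

def vcol_cycle(connection: str, outcome_affirmed: bool, rate: float = 0.2) -> None:
    """One cycle: apply an idea, reflect on the outcome, update the connection."""
    w = connections[connection]
    connections[connection] = w + rate * (1 - w) if outcome_affirmed else w * (1 - rate)

# Repeated cycles: experience keeps affirming one idea and negating the other.
for _ in range(5):
    vcol_cycle("rust-adds-mass", outcome_affirmed=True)   # weighing confirms it
    vcol_cycle("rust-is-lighter", outcome_affirmed=False) # weighing disconfirms it

print(connections)  # the affirmed connection approaches 1; the negated one decays
```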

Learning through intentional virtuous cycles ensures that our mental network is constantly being challenged with new information, so alterations to point of view are possible any time we receive information that doesn't easily fit into the existing network. But this kind of learning is also part of a larger developmental process in which our mental networks undergo major reorganizations called hierarchical integrations that produce fundamental qualitative changes in the way we think.

Here are some of the similarities I see between transformative learning and our learning model:

  1. Both are based on developmental mechanisms (reflective abstraction, assimilation, accommodation, hierarchical integration, chunking, qualitative change, and emergence) that were the hallmarks of Piagetian and Neo-Piagetian theory. The jargon and applications may be different, but the fundamental ideas are very similar.
  2. Both are strongly influenced by the work of Habermas (communicative action) and Freire (critical pedagogy).
  3. Both lead to a pedagogy that emphasizes the role of critical reflection and perspectival awareness in high quality learning. 
  4. Both emphasize the involvement of the whole person in learning.
  5. Both transcend conventional approaches to learning.

Here are some differences I've identified so far:

  1. Terminology: The two traditions use different vocabularies, so overcoming this problem requires pretty active perspective seeking!
  2. Role of critical reflection: For us, critical reflection is both a habit of mind to cultivate (in VCoL+7, it's one of the +7 skills) and a step in every (conscious) learning cycle (the "reflect" step). I'm not sure how this is viewed in transformative learning circles.
  3. Target: We have two learning/development targets, one is meta, the other is incremental. Our meta target is long-term development, including the major transformations that take place between levels in our developmental model. Our incremental target is the micro-learning or micro-development that prepares our neural networks for major transformations. 
  4. Measurement: As far as I can tell, the metrics used to study transformative learning are primarily focused on the subjective experience of transformation. We take a different approach, measuring the way in which learning experiences change our conceptions or the way we approach real-world problems. We don't ask what people think or what they learned; we ask how they think with what they learned.

Proficiency vs. growth

We've been hearing quite a bit about the "proficiency vs. growth" debate since Betsy DeVos (Trump's nominee for Education Secretary) was asked to weigh in last week. This debate involves a disagreement about how high stakes tests should be used to evaluate educational programs. Advocates for proficiency want to reward schools when their students score higher on state tests. Advocates for growth want to reward schools when their students grow more on state tests. Readers who know about Lectica's work can guess where we'd land in this debate—we're outspokenly growth-minded.
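To see how the two camps can rank the same schools differently, consider a minimal sketch (Python; the schools, scores, and the proficiency cutoff are all invented for illustration):

```python
# Invented example: two schools, the same students tested in consecutive years.
# Proficiency rewards high final scores; growth rewards year-over-year gains.

school_a = {"year1": [240, 250, 260], "year2": [245, 255, 265]}  # high scores, small gains
school_b = {"year1": [200, 210, 220], "year2": [220, 230, 240]}  # low scores, big gains

CUTOFF = 250  # arbitrary "proficient" threshold for this sketch

def proficiency_rate(scores):
    """Fraction of students at or above the cutoff."""
    return sum(s >= CUTOFF for s in scores) / len(scores)

def mean_growth(year1, year2):
    """Average year-over-year gain per student."""
    return sum(b - a for a, b in zip(year1, year2)) / len(year1)

for name, s in [("A", school_a), ("B", school_b)]:
    print(name, proficiency_rate(s["year2"]), mean_growth(s["year1"], s["year2"]))
# School A wins on proficiency (2/3 proficient vs 0); School B wins on growth (+20 vs +5).
```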

For us, however, the proficiency vs. growth debate is only a tiny piece of a broader issue about what counts as learning. Here's a sketch of the situation as we see it:

Getting a higher score on a state test means that you can get more correct answers on increasingly difficult questions, or that you can more accurately apply writing conventions or decode texts. But these aren't the things we really want to measure. They're "proxies"—approximations of our real learning objectives. Test developers measure proxies because they don't know how to measure what we really want to know.

What we really want to know is how well we're preparing students with the skills and knowledge they'll need to successfully navigate life and work.

Scores on conventional tests predict how well students are likely to perform, in the future, on conventional tests. But scores on these tests have not been shown to be good predictors of success in life.*  

In light of this glaring problem with conventional tests, the debate between proficiency and growth is a bit of a red herring. What we really need to be asking ourselves is a far more fundamental question:

What knowledge and skills will our children need to navigate the world of tomorrow, and how can we best nurture their development?

That's the question that frames our work here at Lectica.

*For information about the many problems with conventional tests, see FairTest.

How to teach critical thinking: make it a regular practice

We've argued for years that you can't really learn critical thinking by taking a critical thinking course. Critical thinking is a skill that develops through reflective practice (VCoL). Recently, a group of Stanford scientists reported that a reflective practice approach not only works in the short term, but it produces "sticky" results. Students who are routinely prompted to evaluate data get better at evaluating data—and keep evaluating it even after the prompts are removed. 

Lectica is the only test developer that creates assessments that measure and support this kind of learning.