Posts Tagged cognitive development

What is a holistic assessment?

Thirty years ago, when I was a hippy midwife, the idea of holism began to slip into the counter-culture. A few years later, this much misunderstood notion was all the rage on college campuses. By the time I was in graduate school in the nineties there was an impassable division between the trendy postmodern holists and the rigidly old-fashioned modernists. You may detect a slight mocking tone, and rightly so. People with good ideas on both sides made themselves look pretty silly by refusing, for example, to use any of the tools associated with the other side. One of the more tragic outcomes of this silliness was the emergence of the holistic assessment.

Simply put, the holistic assessment is a multidimensional assessment that is designed to take a more nuanced, textured, or rich approach to assessment. Great idea. Love it.

It’s the next part that’s silly. Having collected rich information on multiple dimensions, the test designers sum up a person’s performance with a single number. Why is this silly? Because the so-called holistic score becomes pretty much meaningless. Two people with the same score can have very little in common. For example, let’s imagine that a holistic assessment examines emotional maturity, perspective taking, and leadership thinking. Two people receive a score of 10 that may be accompanied by boilerplate descriptions of what emotional maturity, perspective taking, and leadership thinking look like at level 10. However, person one was actually weak in perspective taking and strongest in leadership, and person two was weak in emotional maturity and strongest in perspective taking. The score of 10, it turns out, means something quite different for these two people. I would argue that it is relatively meaningless because there is no way to know, based on the single “holistic” score, how best to support the development of these distinct individuals.

Holism has its roots in system dynamics, where measurements are used to build rich models of systems. All of the measurements are unidimensional. They are never lumped together into “holistic” measures. That would be equivalent to talking about the temperaturelength of a day or the lengthweight of an object*. It’s essential to measure time, weight, and length with appropriate metrics and then to describe their interrelationships and the outcomes of these interrelationships. The language used to describe these is the language of probability, which is sensitive to differences in the measurement of different properties.

In psychological assessment, dimensionality is a challenging issue. What constitutes a single dimension is a matter for debate. For DTS, the primary consideration is how useful an assessment will be in helping people learn and grow. So, we tend to construct individual assessments, each of which represents a fairly tightly defined content space, and we use only one metric to determine the level of a performance. The meaning of a given score is both universal (it is an order of hierarchical complexity and phase on the skill scale) and contextual (it is provided to a performance in a particular domain in a particular context, and is associated with particular content). We independently analyze the content of the performance to determine its strengths and weaknesses—relative to its level and the known range of content associated with that level—and provide feedback about these strengths and weaknesses as well as targeted learning suggestions. We use the level score to help us tell a useful story about a particular performance, without claiming to measure “lengthweight”. This is accomplished by the rigorous separation of structure (level) and content.

*If we described objects in terms of their lengthweight, an object that was 10 inches long and 2 lbs could have a lengthweight of 12, but so could an object that was 2 inches long and 10 lbs.
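For readers who like to see the arithmetic, here is a minimal sketch of the footnote's example. The class and function names are made up for illustration; the only numbers used are the ones in the footnote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """A hypothetical object described on two separate dimensions."""
    length_in: float  # length in inches
    weight_lb: float  # weight in pounds


def lengthweight(obj: Thing) -> float:
    """Lump two unrelated dimensions into one number (the move being criticized)."""
    return obj.length_in + obj.weight_lb


long_light = Thing(length_in=10, weight_lb=2)
short_heavy = Thing(length_in=2, weight_lb=10)

# Both objects receive the same "holistic" number...
print(lengthweight(long_light), lengthweight(short_heavy))

# ...even though they differ on every dimension that was actually measured.
print(long_light == short_heavy)
```

The single number throws away exactly the information you would need to tell the two objects apart, which is the problem with a summed "holistic" score.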

Promoting development

There is a vast literature exploring ways to promote development. Much of this literature focuses on speeding up development; some of it focuses on optimizing development. Although both approaches are intended to support development, there is evidence that approaches focused on optimizing development are likely to do a better job. This is because development involves two intertwined processes, differentiation (broadening and deepening knowledge) and integration. In plain(er) English, you get more adequate integrations at each level if you accomplish rich differentiation at the prior level.

When we code an assessment, we pay close attention to the degree to which the test-taker elaborates each of the sub-skills it targets. In our personal feedback, we note areas of strength and areas that appear to require further growth. The basic idea is to bring all of the sub-skills up to an optimal level of elaboration to support the emergence of next-level integrations.

Most of the readings we suggest are targeted one to two phases (1/4 to 1/2 of a level) above the level of a given performance. This practice has been shown to provide the ideal level of challenge (scaffolding) for optimal growth. We also suggest activities like engaging in discourse with peers, journaling, cultivating a habit of reflection, and improving metacognitive skills, all of which provide support for growth.
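The targeting rule above is simple arithmetic, and a rough sketch may make it concrete. It assumes, purely for illustration, that a level score can be written as a decimal number in which each phase is 0.25 of a level; the function name and the example score are hypothetical and are not part of the scoring system itself.

```python
PHASE = 0.25  # one phase is a quarter of a level, as stated in the text


def reading_target_range(performance_level: float) -> tuple[float, float]:
    """Return the (low, high) level range for suggested readings.

    Readings are pitched one to two phases above the scored performance.
    Writing levels as decimals is an assumption made for this sketch.
    """
    return (performance_level + 1 * PHASE, performance_level + 2 * PHASE)


# Hypothetical performance scored at level 11.
low, high = reading_target_range(11.0)
print(f"Suggest readings between level {low} and level {high}")
```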

We do not teach people to think at higher levels. Higher levels of performance emerge when knowledge is adequately elaborated and the environment supports higher levels of thinking and performance. We focus on helping people to think better at their current level and challenging them to elaborate their current knowledge and skills—including the not-so-sexy nuts-and-bolts knowledge required for success in any context.

Task demands and capabilities

Our developmental assessment system, called the Lectical Assessment System (LAS), can be used to score (a) the performances of persons and (b) the task demands of specific situations/contexts. For example, my colleagues and I have analyzed the task demands of levels of management in large organizations, and tested managers’ developmental level of performance in several skill areas—including reasoning about leadership, reflective judgment, and decision-making.

The figure above shows the relation between the task demands of 7 levels of management and the performance levels of managers occupying these management positions. In this oversimplified image, the task demands of most management positions increase in a linear fashion, spanning levels 10-13. The capabilities of managers do not, for the most part, match these task demands.

This pattern is pervasive—we see it everywhere we look—and it reflects a hard truth. None of us is capable of meeting the task demands of the most complex situations in today’s world. I’ve come to believe that in many situations our best hope for meeting these demands is to (1) work strategically on the development of our own skills and knowledge, (2) learn to work closely with others who represent a wide range of perspectives and areas of expertise, and (3) use the best tools available to scaffold our thinking.

About measurement

The story of how measurement permits scientific advance can be illustrated through any number of examples. One such example is the measurement of temperature and its effects on our understanding of the molecular structure of lead and other elemental substances.

The tale begins with an assortment of semi-mythical early scientists, who agreed in their observations that lead only melts when it is very hot—much hotter than the temperature at which ice melts, and quite a bit cooler than the temperature at which iron melts. These observations, made repeatedly, resulted in the hypothesis that lead melts at a particular temperature.

To test this theory it was necessary to develop a standard for measuring temperature. A variety of early thermometers were developed and implemented. Partly because these early temperature-measuring devices were poorly calibrated, and partly because different temperature-measuring devices employed different scales, the temperature at which lead melted seemed to vary from device to device and context to context.

Scientists divided into a number of ‘camps’. One group argued that there were multiple pathways toward melting, which explained why the melting seemed to occur at different temperatures. Another group argued that the melting of lead could not be understood apart from the context in which the melting occurs. Only when a measure of temperature had been adequately developed and widely accepted did it become possible to observe that lead consistently melts at about 327 °C.
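To illustrate how much of the apparent disagreement can come from the scales alone, here is a small sketch. The three readings are not historical data; they are simply the same melting point expressed in Celsius, Fahrenheit, and kelvin, and the function and variable names are invented for the example.

```python
import math


def to_celsius(value: float, scale: str) -> float:
    """Convert a reading on a named scale to degrees Celsius."""
    if scale == "celsius":
        return value
    if scale == "fahrenheit":
        return (value - 32.0) * 5.0 / 9.0
    if scale == "kelvin":
        return value - 273.15
    raise ValueError(f"unknown scale: {scale}")


# Illustrative readings of the same event (lead melting) on three scales.
readings = [(327.0, "celsius"), (620.6, "fahrenheit"), (600.15, "kelvin")]

# On their own scales the raw numbers look as if they disagree.
print([value for value, _ in readings])

# Converted to one agreed standard, they describe the same temperature.
converted = [to_celsius(value, scale) for value, scale in readings]
print([round(c, 1) for c in converted])
print(all(math.isclose(c, 327.0, abs_tol=1e-6) for c in converted))
```

Poor calibration is a separate source of error that no conversion can remove, which is why the story needs both a shared scale and a well-calibrated instrument.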

Armed with this knowledge, scientists asked what it is about lead that causes it to melt at this particular temperature. They then developed hypotheses about the factors contributing to this phenomenon, observing that changes in altitude or air pressure seemed to result in small differences in its melting temperature. So, context did seem to play a role! In order to observe these differences more accurately, the measurement of temperature was further refined. The resulting observations provided information that ultimately contributed to an understanding of lead’s and other elements’ molecular structure.

While parts of this story are fictional, it is true that the thermometer has greatly contributed to our understanding of the properties of lead. Interestingly, the thermometer, like all other measures, emerged from what were originally qualitative observations about the effects of different amounts of heat that were quantified over time. The value of the thermometer, as we all know, extends far beyond its use as a measure of the melting temperature of lead. The thermometer is a measure of temperature in general, meaning that it can be employed to measure temperature in an almost limitless range of substances and contexts. It is this generality, in the end, that makes it possible to investigate the impact of context on the melting temperature of a substance, or to compare the relative melting temperatures of a range of elemental substances. This generality (or context-independence) is one of the primary features of a good measure.

Good measurement requires (1) the identification of a unidimensional, content- and context-independent trait (temperature, length, time); (2) a system for assessing the amount of the trait; (3) determinations of the reliability and validity of the assessments; and finally (4) the calibration of a measure. A good thermometer has all of the qualities of a good measure. It is a well-calibrated instrument that can be employed to accurately and reliably measure a general, unidimensional trait across a wide range of contexts.

It was this perspective on measurement that first inspired me to try to find a good general measure of the developmental dimension. To read more about how this way of thinking relates to the Lectical Assessment System (LAS), read About Measurement on the DTS site. Pay special attention to the list of things we can do with the LAS.

Testing as part of learning 2

I can’t help it, I’m a developmental psychologist. I’ve been lurking about, watching my granddaughter, Erwin, as she learns to master her environment. She’s about 8 months old now (real age; she was three months premature, so her birth age is 11 months).

Last week, Erwin figured out that complex actions can be used intentionally to make things happen in social situations. For example, she started reaching toward her Mom and Dad to indicate her intention to be picked up. At around the same time, she began pointing to objects to indicate interest or draw them to the attention of others. And she has begun to imitate actions like waving, clapping, and head shaking. Today, when we were Skyping, she clapped her hands to get me to play pat-a-cake, and she shakes her head to get her Mom to do the same—which she finds hilarious. To Mom’s dismay, Erwin is so excited by this new way of influencing her environment that she has stopped napping.

To see an example of Erwin’s attempts at verbal communication and her new reaching behavior, double-click on the picture below. Notice how emphatic her arm extension is, and how she makes eye contact as she reaches out.

A few months ago, most of Erwin’s actions were aimed toward physical mastery—learning to obtain objects and manipulate them in a variety of ways, learning to move herself toward things she wanted to manipulate, or playing with sound just to hear the results.

When she was learning to do physical things, the physical environment provided most of the feedback. Although her parents were there to give encouragement, we all had the sense that it was the physical feedback that she craved—getting an object to her mouth, inching toward a favorite toy, pulling herself to stand.

Now she craves feedback from her parents; she has shifted her focus from physical mastery to social mastery. She reaches for Mom and gets picked up. She shakes her head and Mom shakes her head back. She points to a banana, and Dad brings it to her. She claps her hands, and Grandma plays pat-a-cake. And every time she undertakes a new action, she is conducting a test.

Testing is part of learning.

Each time any infant tries out a new skill, she is conducting a test. Each attempt is part of an action-feedback loop. Repeated attempts to master a new skill form a series of these action-feedback loops. Each iteration is an exemplary test—in the sense that it is educative—that guides the infant incrementally toward a new level of mastery.

Interestingly, infants never tire of this kind of testing, even when the feedback is not instantly gratifying. In fact, much of the feedback is along the lines of, “almost, but not quite,” or “that didn’t work,” neither of which seems to get in the way of infant learning. For example, when Erwin first started reaching toward her parents to ask to be picked up, her action was not easy to read. It rarely got the desired response. She gradually learned that the reaching needed to be clearly directed toward the parent and accompanied by eye contact. Now the message is, “You’ve got it!” At this point, Erwin takes the skill for granted, and has shifted her attention to things she has not yet mastered, like figuring out how to get adults to do other interesting or gratifying things.

The natural action-feedback mechanism of infancy works perfectly, because the proverbial carrot is usually, due to the very nature of normal human environments, dangled at just the right distance. Good parents respond to early attempts at communication, rewarding them with interesting responses, but success isn’t the only reward; it’s always accompanied by a new “carrot”—another interesting possibility just beyond the infant’s reach. In this way, the action-feedback mechanism functions both as an aid to learning and as a motivator.

Aspects of this “carrot-and-stick” perspective on learning have been expanded and described in a variety of research traditions—e.g., as part of the notion of reinforcement feedback in social learning theory (Bandura, 1977), as the zone of proximal development in Vygotsky’s (1986) work, and as part of a complex process of assimilation and accommodation in Piaget’s (1985) work. It is important, because it speaks both to how we learn and to our motivation for learning. Good feedback plays two essential roles. First, it helps the learner decide what to try next. Second, it motivates the learner to keep striving toward mastery. And, as the infant example suggests, feedback cannot be reduced to simple reward or punishment. Ideally, it is information that supports learning by being useful to the learner. Learners are not motivated by reward or punishment per se, but by an optimal combination of “not there yet,” “almost,” and “you’ve got it”.

DiscoTests are for learning

Most of today’s tests provide feedback in the form of rewards (good grades, advancement, or honors) or punishment (bad grades and failure). My colleagues and I don’t find this acceptable, so we’ve created a nonprofit called DiscoTest. The overarching objective of the DiscoTest Initiative is to contribute to the development of optimal learning environments by creating assessments that deliver the kind of educative feedback that learners need to learn optimally. DiscoTests determine where students are in their individual learning trajectories and provide feedback that points toward the next incremental step toward mastery.

I’ll be writing more about DiscoTest in future posts. For now, if you’d like to know more, please visit the DiscoTest web site.

The SOI and the LSUA, part 1

The Subject-Object Interview (SOI) and the Lectical™ Self-understanding Assessment (LSUA)

Before I write about the relation between Kegan’s SOI and the LSUA, I want to clarify some differences between these assessments. First, the SOI is both an interview and an assessment system. It was developed by studying the interviews of a small sample of respondents (Does anyone know how many?) who were interviewed on several occasions over the course of several years (Again, does anyone know how many or how often?). The level definitions and the scoring criteria in the SOI are tied to the subject matter of the interviews in the original sample (construction sample). For this reason, the SOI is called a domain-specific assessment. Researchers would say that the levels were defined by bootstrapping from the longitudinal data. Critiques of this kind of assessment point to bias in their level definitions (due to their small and culturally narrow construction samples), the related conflation (confusion) of particular conceptual content with developmental levels, and a weak articulation of the lowest levels, which are not based on direct empirical evidence from appropriate-aged respondents.

With respect to the LSUA, I want to clarify that it is scored with the Lectical Assessment System (LAS), a content-independent developmental scoring system that was created, in part, by identifying the dimension that underlies all longitudinally bootstrapped developmental assessment systems*. The SOI was one of the assessment systems I studied on the way to developing the LAS. Consequently, if the LAS does what it is supposed to do, it should capture the developmental dimension that underlies Kegan’s system even better than his scoring system, because the LAS is a second-generation developmental scoring system that is not restrained by a content-driven scoring process (Dawson, 2002; Dawson, Xie, & Wilson, 2003; much has been written about this in our published work, available on our web site).

What is the relation between the LSUA and the SOI?

This is a difficult question to answer, partly because there is no research that directly compares the SOI and the LSUA. However, because the LAS is a domain-independent scoring system that can be used to score any text that includes judgments and justifications, I have used it to score the SOI scoring manual. The developmental sequence for SOI levels 3 to 5 corresponds well to the dimension captured by the LAS. However, Kegan’s lower levels do not match up as well, probably because his construction sample (the sample used to define his levels), as far as we can determine, did not include young children. [Kegan's original research was never published in a form that would allow us to evaluate the approach he took to defining his levels or the reliability and validity of the SOI. All we can locate are a few very small studies of inter-rater reliability, most of which are unpublished (Kegan, 2002).]

Comparisons of the SOI with other developmental assessment systems

There is some research comparing the SOI with other developmental assessment systems. In general, this research finds that the SOI and these other systems are likely to tap the same developmental dimension (see Pratt et al., 1991).

Ideally, we would like to conduct a direct comparison of the LAS and the scoring system Kegan developed to score the SOI, as we have done with other developmental assessment systems. (We are working with a graduate student who is planning to do this kind of comparison, and should have some results in a year or so.) In the meantime, we can point to comparisons between the LAS and several other developmental assessment systems (Kohlberg, Armon, Kitchener & King, Perry) that were developed using methods similar to those used by Kegan. We have routinely found strong correlations (above .85) between these scoring systems and the LAS, especially when they are used to score the same material (Dawson, 2000, 2001, 2002a, 2004; Dawson, Xie, & Wilson, 2003).

Finally, some of Kegan’s level definitions are almost identical to those of Kohlberg and Selman. In fact, I would argue that they are primarily an extension of Selman’s original work on socio-moral perspective, which has informed most domain-based developmental assessment systems (including all of the systems mentioned here) since it was introduced in the 1960s (and was a great help to me when I was developing the LAS).

*The claim that there is a single developmental dimension that underlies these systems is NOT the same thing as a claim that an individual will be at the same level in different knowledge areas (or on different lines).

References

Commons, M. L., Armon, C., Richards, F. A., Schrader, D. E., Farrell, E. W., Tappan, M. B., et al. (1989). A multidomain study of adult development. In M. L. Commons, J. D. Sinnott, F. A. Richards & C. Armon (Eds.), Adult development, Vol. 1: Comparisons and applications of developmental models (pp. 33-56). New York: Praeger.

Dawson, T. L. (2000). Moral reasoning and evaluative reasoning about the good life. Journal of Applied Measurement, 1(4), 372-397.

Dawson, T. L. (2001). Layers of structure: A comparison of two approaches to developmental assessment. Genetic Epistemologist, 29, 1-10.

Dawson, T. L. (2002a). A comparison of three developmental stage scoring systems. Journal of Applied Measurement, 3, 146-189.

Dawson, T. L. (2002b). New tools, new insights: Kohlberg’s moral reasoning stages revisited. International Journal of Behavioral Development, 26, 154-166.

Dawson, T. L., Xie, Y., & Wilson, M. (2003). Domain-general and domain-specific developmental assessments: Do they measure the same thing? Cognitive Development, 18, 61-78.

Dawson, T. L. (2004). Assessing intellectual development: Three approaches, one sequence. Journal of Adult Development, 11, 71-85.

Kegan, R. (2002). A guide to the subject-object interview. Unpublished scoring manual. Harvard Graduate School of Education.

King, P. M., Kitchener, K. S., Wood, P. K., & Davison, M. L. (1989). Relationships across developmental domains: A longitudinal study of intellectual, moral, and ego development. In M. L. Commons, J. D. Sinnott, F. A. Richards & C. Armon (Eds.), Adult development, Vol. 1: Comparisons and applications of developmental models (pp. 57-71). New York: Praeger.

Lambert, H. V. (1972). A comparison of Jane Loevinger’s theory of ego development and Lawrence Kohlberg’s theory of moral development. University of Chicago, Chicago, IL.

Pratt, M. W., Diessner, R., Hunsberger, B., Pancer, S. M., & Savoy, K. (1991). Four pathways in the analysis of adult development and aging: Comparing analyses of reasoning about personal-life dilemmas. Psychology & Aging, 6, 666-675.

Sullivan, E. V., McCullough, G., & Stager, M. A. (1970). A developmental study of the relationship between conceptual, ego, and moral development. Child Development, 41, 399-411.
