Most of us who work in evaluation keep in mind, and remind ourselves and others about, the validity and reliability of measurement and evaluation tools. When I design a study, I always think about how to triangulate the data collection and use more than one method to measure the subject in question. Most of us will therefore use several questions to measure the same indicator (and then conduct a reliability test), and make sure the test and indicators actually measure the topic we would like to learn about. However, I never gave much thought to other tools and research instruments that perform other types of measurement, e.g. the polygraph. What polygraphs are meant to do is provide the researcher or authority with information that is generally considered more credible than just another statement or testimony given by the participant. Much research has been and is being done on this, and it is widely known that polygraphs are not very credible or reliable tools for assessing whether a participant is telling the truth (a not-so-credible way to assess credibility!). This raises several questions:
(1) why is it so widely used, while known to be less reliable than the average person (not to mention experts) would expect it to be?
(2) why has humanity not been able to come up with a better, more reliable, more valid solution so far?
(3) what are the consequences of using such a tool on human rights, dignity, and justice?
(4) how can we improve the tool, or suggest a better tool, or at least suggest a tool to triangulate and validate polygraph findings?
It appears these questions are a high priority these days, and there is a competition in the US focusing on “Credibility Assessment Standardized Evaluation (CASE)”. This Prize Challenge offers five prizes to teams and individuals who suggest fruitful tools to assess and standardize the evaluation process for credibility tools. In their words: “The CASE Challenge is … to develop credibility assessment evaluation methods that can be used to objectively evaluate both existing and future credibility assessment techniques/technologies”.
Registration is open now, and those who present winning solutions will be invited to Washington, DC in summer 2019. I am highly curious to learn what options we have to create a better, standardized evaluation, especially one focusing on intended future behaviour. A reliable, standardized solution may be replicated in other areas such as program evaluation in education and social services.
In the following series of posts, I will introduce, review, discuss, explain, and examine the wicked topic of systems, by integrating several disciplines of knowledge (such as physics and sociology).
I will talk about actual consequences and implementations related to the social element in our life. By “social” I refer to the knowledge, perception, and practice about the human society. The materials I use are taken from academic books, lectures, YouTube videos, peer reviewed journals and so on.
The basis of systems thinking is that everything is interrelated with everything else. In simple words, what goes around comes around. It is no accident that this expression uses a round shape. It is circular. Think for a moment about the earth, rotating around itself and revolving around the sun, other planets revolving within the same system, the solar system within the Milky Way, the galaxy within the universe, and so forth. Step back, and see the same pattern for humanity. A child is born, encircled by family, encircled by the extended family, encircled by neighborhood, city, country, humankind, etc. Now, add more components (let’s call them “agents”) to the system: wildlife, weather, forests, energy. Another frequent example is to imagine a cloudy sky. We automatically know it is going to rain, and that the sky will then be blue again.
Russell Ackoff (2000) explains what a system is: “A system is whole which cannot be divided to parts, the system is dependent on how the parts interact, not how parts act alone. An example: life, our body; part of cars”.
We live in a never-ending system. In effect this is an infinite, interrelated system in which every agent affects other agents, and those relationships cause the system to change dynamically. In fact, we are part of this complex dynamic system, and what we do undoubtedly affects other agents in so many ways, shapes, and variations. But we tend not to see it, because we are not used to it. We think linearly. We see straight lines. Cause and effect. A straight arrow from A to B. A led to B. We have a hard time internalizing the obvious fact that it is a circle.
This may sound weird to you, but before getting familiar with Physics (especially quantum theory and information theory), and systems thinking in general, I was doubtful and considered myself a woman of facts and strong reality, with a special affection to multiple linear regression.
Peter Senge has introduced systems thinking repeatedly, in his book “The Fifth Discipline” and in other lectures (2014), with the very clear statement that gut and heart are fundamental to every process of effective learning and action, and that leaders are the key. We first grasp it with the heart, then we translate it into thinking.
Ricardo Valerdi (2011) is convinced that systems thinking is not a natural act. He explains that interruptions distract us from what we are doing, and that our brain can handle only so much complexity and dynamics. As an example, he mentions Dan Ariely’s book “Predictably Irrational”, on how people tend to make wrong decisions because of the limits of their abilities.
In summary, systems thinking is not natural to us. We are not used to it, and our school systems educate us to think linearly. However, once we start seeing the patterns and interrelationships and grasp complexity as a fact of life, it is only a matter of time until we view the world utterly differently. Moreover, with enough practice, we can leverage our potential to achieve much better results and avoid repeating problems.
As Senge says in his book “The Fifth Discipline” (1994): most of the problems we face as humanity reflect our inability to grasp and internalize complex problems.
Most social and educational programs are evaluated in one way or another, and in this post I would like to focus on repeated measures of the same group of participants or individuals, as opposed to comparisons or tests between different groups.
On many occasions we want to learn the impact of an intervention on attitudes, perceptions, and behavior. That is, we want to isolate the impact of the specific intervention (the program) and see how it changed attitudes, perceptions, behavior, or a specific situation, in order to infer whether the intervention was effective or not.
Many of us will conduct a t-test or a repeated-measures test. Another common way to investigate those questions is a linear regression model, which tries to predict the change in our dependent variable from a series of control variables. However, here comes the “catch” –
“Regression to the mean (RTM) is a statistical phenomenon that can make natural variation in repeated data look like real change. It happens when unusually large or small measurements tend to be followed by measurements that are closer to the mean.”
(Barnett et al., 2005)
The problem (RTM) may occur whether we measure an individual or a group, due to the random error (within-subject variance and between-subject variance).
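To make the phenomenon concrete, here is a minimal simulation sketch. It is only an illustration under assumptions of my own (normally distributed true scores with mean 50, independent random error on each occasion, and a hypothetical rule that only people scoring below 40 at pretest enter the program); none of these numbers come from the cited papers. Nothing happens between the two measurements, yet the selected group appears to improve.

```python
import random
from statistics import mean


def simulate_rtm(n=10_000, true_sd=10.0, error_sd=10.0, cutoff=40.0, seed=1):
    """Simulate two measurements with NO intervention in between.

    Each person has a stable true score (mean 50) plus independent random
    error on each occasion. Only people whose pretest falls below `cutoff`
    are kept, mimicking a program that enrolls low scorers.
    Returns (pretest mean, posttest mean) of the selected group.
    """
    rng = random.Random(seed)
    pre_selected, post_selected = [], []
    for _ in range(n):
        true_score = rng.gauss(50.0, true_sd)
        pre = true_score + rng.gauss(0.0, error_sd)
        post = true_score + rng.gauss(0.0, error_sd)  # true score unchanged
        if pre < cutoff:
            pre_selected.append(pre)
            post_selected.append(post)
    return mean(pre_selected), mean(post_selected)


pre_mean, post_mean = simulate_rtm()
print(f"selected group pretest mean:  {pre_mean:.1f}")
print(f"selected group posttest mean: {post_mean:.1f}")
print(f"apparent 'gain' with no intervention: {post_mean - pre_mean:.1f}")
```

The "gain" printed at the end is pure regression to the mean: the group was picked partly because of unlucky pretest errors, and those errors do not repeat at posttest.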
A similar problem is identified as “a standard error of measurement (SEM), which refers to the standard deviation of an individual’s observed scores from repeated administrations of a test (or parallel forms of a test) under identical conditions”
(Koizumi et al., 2015)
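In practice, SEM is often estimated with the classical test theory formula SEM = SD × sqrt(1 − reliability). The sketch below uses that formula, which is my assumption and not necessarily the exact procedure in Koizumi et al.; it also assumes the pretest and posttest have the same SEM and independent errors, so the standard error of the difference is sqrt(2) × SEM. The function names and the example numbers are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def standard_error_of_difference(sem):
    """Standard error of (post - pre), assuming equal SEM and independent errors."""
    return math.sqrt(2.0) * sem


def is_change_beyond_error(pre, post, sem, z=1.96):
    """True if the observed change is larger than z standard errors of difference."""
    return abs(post - pre) > z * standard_error_of_difference(sem)


# Hypothetical test whose scores have SD 10 and reliability 0.84.
sem = standard_error_of_measurement(10.0, 0.84)
print(round(sem, 2))
print(is_change_beyond_error(50, 58, sem))
print(is_change_beyond_error(50, 62, sem))
```

With these made-up numbers, an 8-point change for one individual stays within measurement error, while a 12-point change does not.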
The problem: variations in data sometimes DO NOT reflect a real change, but a correction of a previous random error.
In other words, we jump too fast to define a correlation as causation, without carefully checking whether it really is!
Indeed, research investigating these measurement errors in social contexts shows that many changes are accounted for by RTM or SEM, and do not reflect a real change (Marsden and Torgerson, 2012; Koizumi et al., 2015).
Solutions and Food for Thought:
Be careful when you aim to predict something. Do not assume a vacuum. On the contrary, plan the study cautiously and take into account alternative explanations and different routes for interpretation. In fact, there is some good advice on how to reduce the chance that your study’s results will be affected by natural errors such as RTM.
assign participants randomly to all groups
make sure groups are the same size
always include a control group
control for alternative variables
use tools with high reliability
control for background variables and context
Data Collection and Analysis:
conduct more than one pretest
collect two or more baseline data
control for the baseline average / st. dev. by adding the group mean to the equation (in either regression or ANCOVA)
(Koizumi et al., 2015; Bonate, 2000; Marsden and Torgerson, 2012)
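As a rough illustration of the last point, here is a sketch of a baseline-adjusted comparison in the spirit of ANCOVA. It is my own simplification, not the exact procedure from the cited sources: it assumes one treatment and one control group, a common (pooled within-group) slope of posttest on pretest, and made-up scores. The adjusted effect is the raw difference in posttest means minus the slope times the difference in pretest means.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of post on pre (the common ANCOVA slope).

    `groups` is a list of (pre_scores, post_scores) pairs, one per group.
    """
    sxy = 0.0
    sxx = 0.0
    for pre, post in groups:
        mx, my = mean(pre), mean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    return sxy / sxx


def adjusted_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Baseline-adjusted difference in posttest means (treatment minus control)."""
    slope = pooled_within_slope([(treat_pre, treat_post), (ctrl_pre, ctrl_post)])
    raw_diff = mean(treat_post) - mean(ctrl_post)
    baseline_diff = mean(treat_pre) - mean(ctrl_pre)
    return raw_diff - slope * baseline_diff


# Hypothetical data: the treatment group starts 10 points lower at baseline.
ctrl_pre, ctrl_post = [40, 50, 60], [45, 50, 55]
treat_pre, treat_post = [30, 40, 50], [48, 53, 58]

naive_gain_diff = (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))
effect = adjusted_effect(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(naive_gain_diff, effect)
```

In this toy data the naive difference in gains is larger than the adjusted effect, because part of the low-baseline group's gain is what we would expect from RTM alone.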
Implications on Program Evaluation
Many social and education programs seek to change an attitude or perception, and assist participants in gaining knowledge of certain areas (such as financial literacy or a second language).
Evaluation of these programs usually focuses on perception measurement using a before-after design. Most of the time, RTM is not taken into account, and therefore the interpretation of program impact may be wrong. Needless to say, designs without a “before” measurement are worth NOTHING in terms of explaining program impact or change. In addition, there is a second aspect to emphasize, which is the presence of a control group. Very often it is very difficult to compose a group of participants just for the sake of evaluation; however, you should take into account that if you do not do it, you will never be able to correctly assess either a baseline or a change in your group of study.
In short: be cautious, and plan and conduct evaluation carefully, bearing in mind that a change in attitudes, perception, behaviour or knowledge can have a variety of explanations other than the intervention you evaluate.
Subscribe if you liked (:
…and feel free to contact me regarding program evaluation consulting projects
Barnett, A.G., van der Pols, J.C., Dobson, A.J. (2005). Regression to the mean: what it is and how to deal with it. Int. J. Epidemiol., 34(1):215-220.
Bonate, P. L. (2000). Analysis of pretest–posttest designs. Boca Raton, FL.
Guisasola, J., Solbes, J., José-Ignacio, B., Maite, M., Antonio, M. (2009). Students’ Understanding of the Special Theory of Relativity and Design for a Guided Visit to a Science Museum. In: International Journal of Science Education 31(15), 2085-2104
Koizumi, R., In’nami, Y., Azuma, J., Asano, K., Agawa, T., Eberl, D. (2015). Assessing L2 proficiency growth: Considering regression to the mean and the standard error of difference. Shiken, 19(1).
Marsden, E., Torgerson, C. J. (2012). Single group, pre- and post-test research designs: Some methodological concerns. Oxford Review of Education, 38, 583–616.
Ahhmmm…. got to tell you I am excited.
In the past weeks, I have had the opportunity to meet many professionals virtually, especially thanks to our lovely brother LinkedIn. I found myself in a process of learning that I had missed so much! Well, enough with the introductions, let’s start thinking (-:
I would like to take it one step further, and correct me if I am wrong: operations-minded people usually deal with something we can touch, or whose impact we can at least see simply by looking at what the organization does – water, agriculture, farms, vaccines, you name it. Impact measurement in this type of organization is a short and sweet ROI analysis. Efficiency, effectiveness, benchmarking, and performance measurement are relatively easy to conduct, as is setting goals and objectives.
Another important suggestion was to collaborate and cooperate, and even amalgamate organizations that do the same or similar work. I agree with this way of thinking, but would like to ask you managers – will you work on your ego and let another organization work with you, or instead of you? Think about it. If it works – there may be great, extremely successful models, as in business supply chain management, which employ this attitude – more for less – through a chain of organizations. Phrase it like this: the bottom line is the VALUE profit. As long as there are organizations which may do the same as, or even better than, your organization, the real impact will be achieved by working together and collaborating. It also saves you money and reduces costs!
I know it is the hard part for managers’ egos, but it is probably one of the realistic ways to keep the sector alive in terms of impact.
In short, the question to ask yourself is this – What is your organization’s unique selling proposition? What is your competitive advantage over others? What is the special value your organization creates? You are supposed to have a very good answer to this question.
An additional note in this regard is Collective Impact. Someone referred me to the Collective Impact website. I researched it and browsed the web, and I must compliment them for doing the first operational step – creating coalitions and collaborations in order to increase impact. However, this is only the first necessary step. The sector is shrinking and will continue to shrink, therefore there is a vital need to amalgamate or eliminate ineffective programs, not just collaborate.
The Social Value Impact Aspect
The problem of impact comes to life again when we deal with the social hot potato, and you know what?! I am dealing with it!
There is certainly broad agreement on the need, although I must admit that I am still shocked to see huge foundations refuse to measure themselves (ego and power issues??)… But let’s put it aside. Just another small reference – social enterprises are a relatively small part of the nonprofit sector. I will never call them the “fourth sector”, because they are not. I will never agree that they act differently from the third sector: they are value-driven, and this is the crucial aspect. They DO NOT care about money more than value, and DO NOT care about profit and value the same way. Therefore they DO NOT have a double bottom line, but one, and that one is very similar to that of the pure nonprofits. I can count a few real social enterprises, but they will again be the ones we can touch – bakery, restaurant, agriculture, cafe, and the like.
Show me one social services enterprise… it doesn’t exist, because it is impossible, and here comes social impact measurement to help us.
I did not define Standardization last time, so here it is: in my opinion it needs to be simple. The metrics should include up to 4 core elements, which will be relevant across the sector. This makes it possible to compare one organization to another. You can add as many indicators as you like, and it won’t harm the metrics, but will give your organization the specific information you are looking for.
So, this is my review, which somehow turned out to be very small and narrow…
I rank tools on a 1-3 scale on each criterion: 1=low, 2=medium, 3=high. Hence, the highest total score is 9.
I try to keep it as simple as possible, so I do not rank 1-10 or something like that.
An important note! If your organization does not have a work plan which includes vision, goals, and objectives, you cannot employ social impact measurement at this time. You MUST define the above in advance. Do not know how to do it? Drop by my post on setting goals, and keep up the good work!
SROI
I love this measure; however, in short, it does not apply to many social services and education nonprofits. If you are dealing with employment or any other outcomes which involve money, this may be the measure for you.
Usefulness (1): Applies to a narrow range of organizations (usually employment services, financial assistance, microfinance, and similar).
Friendliness (1): If you do not learn it thoroughly and gain lots of knowledge, you probably won’t be able to conduct a reliable SROI analysis.
Standardization (2): This part gets a fairly high score, because lots of research has been done; however, there is no option to apply it broadly enough.
Total score: 4/9 (44%)
GRI / IRIS
I like the business attitude. These metrics will not save your life, but will definitely give you a way to benchmark your organization. This tool is used by many for-profits to monitor their performance, so I would rank it as follows:
Usefulness (1): It gets a low score here, because I am not sure how it is going to help many organizations in their day-to-day management in terms of measuring social outcomes. However, some organizations might fall under the suggested social objectives, so I suggest checking it out.
Friendliness (2): The tool seems to be highly recommended and widely used by a variety of organizations. It does not earn the full 3 points, because it does not fit every organization.
Standardization (2): You win the entire pot here. The tool is absolutely standardized, and you may feel free to compare your organizational performance to similar organizations in the industry. It is a huge advantage. I ranked it 2, because it does not apply in every field.
Total score: 5/9 (56%)
Social Impact Bonds (SIB)
I like the idea of social finance, because it makes much sense. It really builds a reputation for impact investing. In short, the system is designed to invest money in social projects, in order to PREVENT problems from reoccurring in the future (such as second-generation issues, recidivism, unemployment in specific sectors, etc.). The model briefly works by this: funding is given > intervention is made > evaluation of outcomes is conducted > in case of success (i.e. less recidivism, more employment) the government returns money to the investors. Even though I like the idea, I have no clue regarding the metrics and indicators they are using in order to evaluate social programs… it does not seem standardized or friendly, but it is just my outsider opinion.
Moreover, in my opinion, the social impact bonds model seems to fit to a narrow type of outcomes, kinda similar to SROI. Total score: Unknown!
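For anyone who wants to reuse the ranking with their own tools, the arithmetic can be sketched as below. This is only an illustration of the 1-3 scale across the three criteria; the function and dictionary names are hypothetical, and a tool with any missing rating (like Social Impact Bonds here) gets no total. Note that 5/9 is 55.6%, which rounds to 56%.

```python
CRITERIA = ("usefulness", "friendliness", "standardization")
MAX_PER_CRITERION = 3


def total_score(ratings):
    """Return (total, max_total, percent), or None if any rating is missing."""
    if any(ratings.get(criterion) is None for criterion in CRITERIA):
        return None
    for criterion in CRITERIA:
        if not 1 <= ratings[criterion] <= MAX_PER_CRITERION:
            raise ValueError(f"{criterion} must be between 1 and {MAX_PER_CRITERION}")
    total = sum(ratings[criterion] for criterion in CRITERIA)
    max_total = MAX_PER_CRITERION * len(CRITERIA)
    return total, max_total, round(100 * total / max_total)


# Ratings taken from the review above; SIB was left unrated.
tools = {
    "SROI": {"usefulness": 1, "friendliness": 1, "standardization": 2},
    "GRI / IRIS": {"usefulness": 1, "friendliness": 2, "standardization": 2},
    "Social Impact Bonds": {},
}

for name, ratings in tools.items():
    result = total_score(ratings)
    if result is None:
        print(f"{name}: unknown")
    else:
        total, max_total, percent = result
        print(f"{name}: {total}/{max_total} ({percent}%)")
```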
My list is much shorter than expected. I reviewed over 10 tools and methodologies, but did not like them at all, so why mention them?
With that said, when it comes to social services and education, and other soft, hard-to-touch outcomes, the measures and indicators become useless. No usefulness, no friendliness, no standardization. Nothing helps. Therefore we must agree that there is another way to measure, and you know what?! it is not the kind, the gentle one… it is about achieving your objectives, measuring your VALUE. It is that simple.
Your objectives include “improvement of students’ grades”? Show the improvement, between the beginning of the year and the end. Your objectives include “women’s empowerment”? Define what empowerment is, let’s say, they will be more responsible for their day-to-day tasks. Show they have changed their behavior/attitudes.
Do not use excuses like “they are happy”, “their self esteem is higher”, “I feel it”… these are NOT your objectives, and therefore not the social value you wanted to create.
My post that dealt with the lack of measurement in terms of success has led to some enlightening comments from my colleagues and past managers, so I have decided to dedicate my coming posts to researching this field and digging a bit more. It should be noted that I have done some research in the past, and sort of consider myself someone who has gained some knowledge regarding SROI and other impact measurement efforts; however, it is insufficient, and this is why I find it imperative to research now.
Another note before we dive in… I plan to write a series of posts on this topic. Firstly, I would like to cover some methodology, i.e. set some conditions for my research. Secondly, review existing solutions with respect to the methodology. Finally, I hope to come to a conclusion with the most relevant tools or suggestions for the future.
Ready to think?! Let’s do it!
The first question, above all, is WHY do we need to measure impact in social projects? It is indeed a vital and important question, and… I have a very good answer in my pocket. A very smart senior executive and philanthropist emailed me the following statement, based on his extensive experience with foundations: about 85% of the funding in social projects is lost without achieving its goal. These are insane numbers, my friends. Simple math reveals the bare truth. In the US alone, ~333.5 billion dollars were donated in 2013. Based on the above, we can write off roughly $283 billion of that (85% of $333.5 billion). I stop here, because it hurts to think in global terms (not to mention the lovely governmental “match”). I am sure most of you already know that there is a critical issue with the funding-impact ratio, and this statement is just the straw that breaks the camel’s back. It was for me, anyway, and as a consequence I have decided to write a wake-up-and-start-thinking post.
Are you ready to think with me?! I suggest you comment here or by email, and I promise to integrate your thoughts and credit you in my next posts. I honestly believe that we can work together to achieve this goal, as we all have one mission – to find a decent solution. But! We cannot accept every solution. I have developed several criteria for considering a route to be a solution, and you are welcome to add more or suggest adjustments:
1. Usefulness: To what extent can we use the measures in day-to-day management? By this I mean the dual role of metrics. I have held this opinion for years, and every time I state it, people look at me as if I had fallen from the moon. But I actually did not; as far as I recall I was born on earth (-:
So if I get back to the point – the dual role is enabling the use of metrics/measures by both sides, the foundation/funding body and the charity/organization alike. No more measurement for THEM, no more shortening of evaluation time and tools. You want it for your organization because YOU deserve to know what the hell is going on!
The issue of usefulness hit me like crazy back in 2007, when I was working with senior managers in social services whose project was funded by a very large North American federation. They did not want to measure, nor to learn – they already knew the truth, thanks, so no need to measure at all. I asked them why they despised it so much, and they simply said – it is too much work for our overloaded staff, we have no spare time for collecting useless data… They added that they already knew the ins and outs so well that no evaluation or measurement would enlighten anything.
It hit me again when I read Jed Emerson’s post last week, especially here: “I recall a breakout session presentation by one of the world’s best known impact investment organizations—one that appears on everyone’s list of favorite impact funds—listening to a nuanced and well-considered presentation by their head of impact performance. Following the formal PowerPoint show that included impressive definitions, charts and data, the presenter was asked, “How do these impact metrics inform your work?” to which the presenter responded, “Oh—we don’t actually use these metrics in our work. We just need them to give our funders!” After everyone had a good laugh, he said, “No, no—I’m serious—we don’t use them at all!” Peals of continued laughter echoed…”
Things are about to get worse, my dear readers. Not long ago I met a very senior manager who works for a wealthy foundation. It happened to have some connection to one of the projects in which I had been involved in the past, so I was more than curious to learn how they used the data and metrics we sent them for review. Sorry to disappoint you – they did not. They donated the money and forgot about it, and the real hell is yet to come – they never use metrics for themselves, as a foundation… and you know what?! They are absolutely not alone. The same shock made me shake back in 2011, when I realized that a billion-dollar foundation NEVER measures impact and never tried to develop something useful. Moreover, when I was trying to educate them I got the same feeling of Mars and Venus, Earth and Moon, whatever you name it. I bet you understand me.
If so, I conclude: we need something useful – something that people will want to use, need to use, and feel it’s helpful. And please twice, one for the giving side, one for the taking one. I would say that in this aspect we have to talk business, and learn from business cases how to measure ONCE and EFFICIENTLY for more than one stakeholder.
2. Friendliness: To what extent can ordinary managers use the measures in day-to-day management? By this I mean managers who do not have an extensive research background.
It reminds me of one of my nonprofit job interviews (-: A funny story actually, especially if you consider my very limited knowledge of this sector back then, and specifically of the funding issue. The position was highly customer-facing, and dealt with measurement and evaluation in education and social services. The interviewers, there were two of them, asked me if I thought that every social service manager or every school principal should use SPSS (yes, the statistical package). I was sure they were kidding, but they were not, so I answered “of course NOT, they can use Excel, which is much easier to adapt, use and learn”, and got the job.
I tell you this story because I will never agree that managers need to be researchers. They certainly do not! They need to do their job, to manage! Metrics are just another great tool to facilitate decision making and performance measurement. Yes, it is an essential tool, one that should be a top priority for every manager in the nonprofit sector. Yet I wonder how this vital tool will become friendly and suit every ordinary manager. I think we can again keep an eye on good business practices. We may also bear in mind that we want something simple and Excel-based, with data that is easy to collect and interpret.
I have to warn you: I have met plenty of organizations that invested tons of money in IT solutions for their database, and sadly I can barely count the organizations that really USE this information. They often tend to forget about it, and when they tried to retrieve some data, it was always such a burden, and of poor quality.
I would also like to add one more thing, I believe in short things. What do I mean? I never like the idea of tons of questions or gathering endless information, we are not conducting an academic study (NO we are not!), we have a mission to measure the impact. In order to do that we have to keep in mind that there are busy people who realistically cannot dedicate themselves to information gathering as their life mission, and therefore it has to be short and useful, not just short, not just useful. BOTH.
To summarize my impressions and thoughts in this respect, I would say that there is a vital need for a FRIENDLY and RELEVANT tool to measure nonprofit performance. I do not want to sound as if I am criticizing existing measures, but I feel they are too complicated, and may not be a good answer to the need for friendliness.
3. Standardization: To what extent can we expect to use the same measurement for a wide range of projects? I will leave this criterion open for your comments, because I have already, kind of, fixed this so firmly in my mind, and I feel too strict…
Hope you enjoyed the reading. Please comment, share and subscribe to my newsletter, I promise to make you think (-:
P.S. – A super-talented friend of mine who works in the industry and does a great job with impact and so on read this post before it was published, and was quite amazed that I still insist on making efforts to spread these ideas. He said: “You have got to forget SROI and this, no one wants to learn, no one wants to know, they know everything. Even when I try to spoon feed them they won’t listen”. Well, I dedicate a couple of songs to you. Keep calm and continue doing your great job. Good times are just about to come (-:
Click here if you want to listen to the songs. None of them fully expresses my purpose, but they absolutely set the tone.
I am in love with the not-for-profit sector, I switched around 2006 and almost never look back. I love the idea of helping people, doing good to others, support the world, and make my living as a by-product.
However, I must say that I feel a bit worried. While many great things are happening around us, the nonprofit sector seems to stand aside and wait. Wait for what?! You do a great job, however you must understand – there is an essential need to measure your performance. Do not get me wrong – YOU need it. Not your boss, certainly not the funding organization, not your manager, and not your board of directors. You need it, and you deserve to know what works and what does not.
I do believe that not-for-profit is not for profit, but if we are so confident in that – let’s ask: what are you there for? yes, I get you. You are there to help, to support, to facilitate, to give a hope. I agree, I am absolutely with you. However… how do you know that you really help, support, facilitate, etc.?
Let’s get straight to the point – you need to measure. You have to measure your impact, to benchmark between projects, to justify budgets. You cannot just do the “on the surface” stuff like counting and having nice graphs. The question “how many people used our service in the last 12 months?” is a very good question, but what about the impact? What about the help, support, facilitation and so on? The question “how many people were satisfied with our service?” is a neat question, however it does not provide any insight into your communal impact.
The questions, mistakenly perceived as unimportant, irrelevant, or annoying, are dealing with your impact. This impact can be evidently measured and give you a very clear picture of what works and what not.
Ask yourself – What is the difference between projects? Why project X and not project Y? Why project X must continue? What project holds the highest SROI? What project has got significant impact on your client’s life?
If you do not have answers to these questions – you better start to work on your impact measurement very soon.
The process of impact measurement will be described in my future posts, so stay tuned and feel free to subscribe to the newsletter.
The bad old habit, yeah! think, rethink, decide, do it, over and over again.
Stop! Have you ever thought of taking this practice a step ahead?!
I have collected two great examples for us to think about today. Ready to think? let’s do it together.
The first shot is “Moneyball” trailer. The movie is focused on “Oakland A’s general manager… challenges the system and defies conventional wisdom when his is forced to rebuild his small-market team on a limited budget. Despite opposition from the old guard, the media, fans and their own field manager… forever changes the way the game is played”.
The main lesson from this movie, in my opinion, is to think differently on something which is already well defined and learned…
Traditional thinking is being challenged here, and by using stats and analyses, the GM ends up doing the smartest things in order to win the big players’ game.
The second shot is from “Draft Day”: “a life-changing day for a few hundred young men with dreams of playing in the NFL, general manager Sonny Weaver goes against advice making a series of risky, unexpected maneuvers to save his team”.
The movie, similarly to the one above, deals with a day of shaking the national football league foundations, when the manager, who appears to be working under extreme pressure, makes so-called “risky” decisions, based on research and planning.
I would like to take these two examples and kindly ask you, my readers, to apply a different view in your day-to-day work.
Are you required to report on sales? Plan the budget for next year? Initiate a new program? Monitor and improve processes?
Please, use the numbers. Use them because they are so easy to reach. They are there, waiting for you to put in an Excel sheet or any other package you use… and finally – to use them smartly.
What do I mean by that? I have encountered many managers who knew how to produce stats, but did not know how to separate the wheat from the chaff.
Focus on your goal, and extract the numbers that enable you to decide smartly. How to focus on your goal? Right in my next post (-:
In short: There are numbers, which may help you to decide smartly and differently. Use them!
P.S. Do not forget to watch the full movies, if you haven’t yet.