Earlier this year I got an email from someone in human resources. It was a friendly reminder that performance evaluations were due soon, and that there would be no tolerance for lateness. There was even some excitement about the launch of a new tool for the process, along with a series of standard relationship-destroying questions such as “I would always want <employee name> on my team: Yes or No?”
People who have worked with me for a while know my dislike for the performance evaluation process, both in concept and execution. There has been a lot written about its failure as a process, but my main dislike can be summarized in these two points:
1. The coupling of improvement with reward mechanisms, particularly pie-sharing activities like allocating bonuses. The improvements almost always take a back seat to justifications from the reviewer or reviewee for why the allocation should be so.
2. The relationship-destroying nature of the formality: working alongside each other as partners and then suddenly shifting to formal judgement. Imagine trying to do that with your spouse.
In past years I’ve coped by improving the process in both data capture and delivery, at times even watching this Deming video together with those on my team to help rehumanize it.
This year I was intent on trying something new: if I was going to have to do this, let’s see if we can learn something.
At the time I had not yet made the decision to move into independent consulting, and so didn’t realize this might be the last set of performance evaluations I’d ever do. But perhaps it was a bit of foreshadowing, because I ended up having some fun with my team; we all agreed this was the best approach any of us had ever experienced.
Do we really know what is expected of our job?
My team consisted of people at various levels with different titles including “Service Delivery Manager”, “Scrum Master” and “Agile Coach”. And while titles and levels may come with varying expectations, it was my suspicion that even people with identical roles would have entirely different expectations put on them, and that these were highly dependent on their particular context: the people they worked with, the area of the business, the nature of the work and even previous roles and assignments that they may have had. I wanted a way to put this theory to the test, and hopefully turn it into actionable information that my team could use.
As Kanban practitioners, we tend to look at the world through what’s referred to as the “Kanban Lens”. A big part of that lens is to think in terms of services: everything in an organization, be it a tangible product or one-off process, should in some way be recognizable as part of a service to a customer.
At the time I had been applying a new framework called Fit for Purpose (“F4P”, Anderson and Zheglov) to how groups could think about organizing their services, and a thought occurred to me: “perhaps that concept could be extended to an individual level.”
Most of my team had already been exposed to F4P in some way, so it was with some excitement that I pitched a new process to them: “Remember how the F4P process had this concept called a Fitness Box Score? It told you what your customer’s purposes were for a particular service, how well the service fit those purposes, and finally why. Haven’t you noticed that those around you expect all sorts of things from you? Wouldn’t you like to find out what they really are?” In my mind the response felt like “yeah, let’s do this!” but it was likely something more along the lines of “worth a try, better than anything else we’ve tried.”
Gathering the data
Each member was responsible for gathering a list of names of people they worked with and sending it my way. Each person identified got an electronic version of a Fit for Purpose Card. The card had 3 sections:
1. List up to 3 expectations you had of this individual
2. Score each expectation from 0 to 5
3. Explain the reasons for the score
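For readers who want to capture a similar card electronically, here is a minimal Python sketch of how the three sections might be modeled. The class and field names (`Expectation`, `FitnessCard`, `respondent`, `subject`) are my own illustrative assumptions, not part of the F4P framework or of the tool we actually used.

```python
from dataclasses import dataclass, field
from typing import List

# Limits taken from the card described above: up to 3 expectations, scored 0 to 5.
MAX_EXPECTATIONS = 3
MIN_SCORE, MAX_SCORE = 0, 5


@dataclass(frozen=True)
class Expectation:
    """One expectation, its fitness score, and the reason for the score."""
    purpose: str
    score: int
    reason: str

    def __post_init__(self):
        if not MIN_SCORE <= self.score <= MAX_SCORE:
            raise ValueError(f"score must be between {MIN_SCORE} and {MAX_SCORE}")


@dataclass
class FitnessCard:
    """A single response: who answered, about whom, and what they expected."""
    respondent: str   # hypothetical field: the person filling in the card
    subject: str      # hypothetical field: the team member being described
    expectations: List[Expectation] = field(default_factory=list)

    def __post_init__(self):
        if len(self.expectations) > MAX_EXPECTATIONS:
            raise ValueError(f"a card holds at most {MAX_EXPECTATIONS} expectations")


# Invented example data, for illustration only.
card = FitnessCard(
    respondent="colleague-a",
    subject="team-member-x",
    expectations=[Expectation("Keep the board up to date", 4, "Usually current")],
)
```

Validating the limits at construction time keeps malformed cards out of any later analysis.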
The Group Picture
Just under 100 people were polled, with a response rate just shy of 50%. Not surprisingly, quite a number of purposes came back for my team: in total, 146 purposes were identified.
This confirmed our suspicion going into this experiment that we would get a very diverse set of expectations. The overall box score also suggested that, as a collection of individuals, we were mostly fit for the people we worked with. But the diversity of purposes needed to be explored further to get a sense of how truly healthy this was.
I then looked at creating categories for these purposes and ranking them from most to least identified. This would consolidate recurring needs that had been expressed in varying ways, and capture which were most frequently identified. We were then able to attach fitness scores to each category to get an idea of how fit each purpose was. I was hoping this would give us some additional objectivity; perhaps it would be ok to be less fit in infrequent purposes, while fitness gaps in frequently identified areas would underscore the need for action.
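The ranking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes each purpose has already been assigned a category by hand, and it uses the mean score as the category's fitness, which is one reasonable choice rather than a record of the exact calculation we used. The function name `rank_categories` and the sample scores are invented for the example and are not our survey results.

```python
from collections import defaultdict
from statistics import mean


def rank_categories(responses):
    """Group (category, score) pairs and rank categories by how often they appear.

    Returns a list of dicts, most frequently identified category first,
    each carrying the count and the mean fitness score for that category.
    """
    scores_by_category = defaultdict(list)
    for category, score in responses:
        scores_by_category[category].append(score)

    summary = [
        {"category": category, "count": len(scores), "mean_fitness": mean(scores)}
        for category, scores in scores_by_category.items()
    ]
    # Most identified first; ties are broken alphabetically for a stable order.
    summary.sort(key=lambda row: (-row["count"], row["category"]))
    return summary


# Invented sample data, for illustration only.
sample = [
    ("Team Management", 4),
    ("Team Management", 2),
    ("Team Management", 3),
    ("Service Management", 5),
    ("Service Management", 4),
    ("Reporting", 1),
]

ranked = rank_categories(sample)
for row in ranked:
    print(row["category"], row["count"], row["mean_fitness"])
```

Reading the count and the mean side by side is what lets you separate a low score in a rarely mentioned category from a gap in one that many people care about.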
The results allowed us to focus mostly on comments in the most identified categories, to understand why those purposes were fit or why they weren’t.
We also consolidated the purposes further into segments, to help us identify opportunities to elevate some segments and make them more the focus of the job, as well as ways to eliminate unwanted areas. We didn’t take any action in that area at the time, but did consider it as a future possibility, particularly as we had a desire to elevate #3, Service Management, and wanted to reduce the demand for the #1 item, Team Management.
We also found that needs that had come up in previous years had died down naturally as relics of the past, which suggested to us that some of our new approaches to managing work, particularly through Kanban, were supplanting previous management practices in an evolutionary way.
Another angle we considered was how purposes differed depending on which groups were asked. In this case we saw the highest rates of fitness with team members, but the lowest with senior management. One big point of discussion with my team was the political and safety implications of fitness gaps with the leadership group. It was a market that could not be ignored!
The Individual Picture
Individual results were only shared with the individual they corresponded to. The initial contentious issue was the reaction to scores that fell below those of the overall group; nobody wants to be considered below average. Part of the guidance from the Fit for Purpose framework was to stop thinking in terms of benchmarking to external sources (i.e. your competition) and focus on yourself and your specific market. In the case of individuals, this guidance continued to make sense.
This thinking was reinforced as we saw significant differences in purposes at the individual level, which made comparison to group benchmarks unhelpful and really an apples-to-oranges exercise.
Some group segments didn’t show up at the individual level, and others that were less identified by the group were more pronounced at the individual level.
The Road Ahead
The reaction across my team was one of empowerment. They began to use this information to engage with those around them and understand more about what people wanted. In many cases the “why” in question 3 wasn’t coherent, and this encouraged dialogue between my team and the rest of the company. It also allowed them to make individual adjustments to their “micro market.”
For some time, my team had been on a continuum moving away from working within structured job descriptions, so this felt like a natural next step. But it gave us all a bit of validation of a problem we had all suspected: trying to work within one-size-fits-all job descriptions did not actually translate to fitness as a collective group, as a person or as an employee, particularly given how dynamically our business has been changing. Even tactics like group training may not be effective given how much difference we found across groups and individuals.
For our team, our general conclusion was this: even at an individual level, knowing your market and adjusting to it was really the only thing that worked when it came to being individually fit at work.