Sunday, August 20, 2017

A Different Class

In Stop Working & Start Thinking (which I also mentioned the other day) Jack Cohen and Graham Medley want scientists to consider what science is and how they do it, as well as just getting on with it. To help explain this, they partition scientific answer-seeking like so:
  • observation
  • measurement
  • investigation
  • experiment

And that's interesting in and of itself. But the authors have been round the block and so recognise that this categorisation is not absolute, and that sometimes it might not be clear where a particular activity sits, and that some activities probably sit in multiple categories at different times and even at the same time.

In science — and thinking, so this applies to you too, testers — generalisations are useful because they help us to frame hypotheses at relevant granularities. We’re all made up of atoms but a description of social deprivation in inner cities at an atomic level would unhelpfully obscure, for example, that higher-level concepts such as social class can be useful factors in such an analysis. Further, this can be true despite social class itself being a fluid notion.

But generalisations are problematic for scientists — and thinkers, which includes us testers. It’s easy to be lulled into a false sense of security when, for example, all known observations show one general thing (swans are white) but then a new observation doesn’t fit (it looks just like a swan, but it’s black). There’s a human tendency to want to avoid complicating a simple model, and so reject the data (that’s not a swan!) which doesn't improve the theory.

Scientists, Cohen and Medley say — and I'd add testers also — should retain a critical distance when classifying: what does it mean to be a black swan? Would, say, a white stripe under one wing affect this? Could there be shades of black? Could there be greys? What explanatory power does a given classification permit? What does it prevent?

Further, they add, involving tools in categorisation, or more generally in any measurement, requires another consideration:
... when measurement is automated, the final figure, the one you get from the machine, incorporates the prejudice of the designer, not yours!

Saturday, August 12, 2017

See You Triangulater

Perhaps it's true that there's nothing new under the sun, but that doesn't mean that what's already known is necessarily uninteresting. Here's a quick example: I was recently reflecting on how talking to multiple people about their perspectives, finding data from several independent sources, or asking the same question in different ways felt analogous to a technique from surveying, triangulation.

Triangulation is an ancient but still widespread method of mapping a landscape in which a network of points is plotted in relationship to one another, with each point always connected to two others, making triangles. Building one triangle against the next, and the next, and the next allows the whole space under consideration to be covered.

I'm nowhere near the first here, though, as a quick search established:
In the social sciences, triangulation is often used to indicate that two (or more) methods are used in a study in order to check the results of one and the same subject ... By combining multiple observers, theories, methods, and empirical materials, researchers can hope to overcome the weakness or intrinsic biases and the problems that come from single method, single-observer and single-theory studies.
So what's my point? Testing frequently involves collations, connections, and comparisons. Triangulation is an interesting model of those activities to consider, for me, right now, even if there's likely no solar system in which it's a novel one.

Thursday, August 3, 2017

On Mapping Non-testable Papers

The Test team book club at Linguamatics read On Testing Non-testable Programs by Elaine Weyuker this week. As usual, the discussion was interesting and, as usual, the reading material was only the starting point and, as usual, we found ourselves exploring our own context and sharing our own experiences and ideas.

I find this kind of interaction invigorating and energising. It remains fascinating to me that we each bring common and unique perspectives to these meetings. I thrive on hearing others on my team talk about how they see the topic, I covet the time I spend thinking about how I see it, and then I immensely enjoy contrasting the two.

I had wondered, while reading the paper, whether I could extract some kind of ontology of oracles from it. Informally, it seemed that Weyuker had structured her analysis in this way: programs are testable or not; these are characteristics of non-testable programs; non-testable programs are of three types; these approaches to oracles can be used with such and such a type, ...

So, to explore that notion, I went back to the paper and gave myself an hour to re-read it, watch a short video by Cem Kaner which references it, and make a mind map. (I'm also interested in experimenting with getting value from mind maps.)

Unsurprisingly, you might say, when studying the paper more closely, with a particular purpose in mind, I found that my informal analysis was a simplistic analysis. In order to map the details of the structure I was interested in, I had to think harder than when I was merely absorbing the higher-level points.

To give one example, Weyuker breaks non-testable programs into three types early in the paper:
It is interesting to attempt to identify classes of programs which are non-testable. These include: (1) programs which were written to determine the answer. If the correct answer were known, there would have been no need to write the program; (2) programs which produce so much output that it is impractical to verify all of it; (3) programs for which the tester has a misconception.
Then, later, in a section about approaches for dealing with two of those types, she says:
For those programs deemed non-testable due to a lack of knowledge of the correct answer ... In the case of programs which produce excessive amount of output ... One other class of non-testable programs deserves mention. These are programs for which not only an oracle is lacking, but it is not even possible to determine the plausibility of the output.
Is this third class the same as the original third class? Or different? (And, as a writer, what can I learn from that?)

Here's the map I produced in my timebox:


Friday, July 28, 2017

An Idea Please, Bob

Conceptual Blockbusting by James L. Adams is subtitled A Guide to Better Ideas. It tackles the problem of lack of creativity by suggesting and categorising blockers and then proposing ways around them. I reviewed the book recently, and was left with a bunch of quotes I enjoyed but didn't have space for. Here they are:

If the problem is not properly isolated it will not be properly solved. (p. 23)

In [Stream Analysis, Jerry Porras] claims that people, especially people in organizations, tend to work on getting rid of symptoms, rather than solving the real problems ... Not surprising, since core problems are more difficult to solve and their solution often creates greater controversy. But perhaps not what we would like to think. (p. 24)

The fear of making a mistake is, of course, rooted in insecurity, which most people suffer from to some extent. Such insecurities are also responsible for [an] emotional block, the "Inability to tolerate ambiguity; overriding desire for order; 'no appetite for chaos.'" ... The solution of a complex problem is a messy process. Rigorous and logical techniques are often necessary, but not sufficient. (p. 48)

The "Preference for judging ideas, rather than generating them" is also the "safe" way to go. Judgement, criticism, tough-mindedness, and practicality are of course essential in problem-solving. However, if applied too early or too indiscriminately in the problem-solving process, they are extremely detrimental to conceptualization. (p. 48)

Another cultural block [is] "Problem-solving is a serious business and humor is out of place." [But] Arthur Koestler ... explained comic inspiration, for example, as stemming from "the interaction of two mutually exclusive associative contexts." As in creative artistic and scientific acts, two ideas have to be brought together that are not ordinarily combined. This is one of the essentials of creative thinking. (p. 61)

No matter how talented the problem-solver, frustration and detail work are inescapable in problem-solving. (p. 69)

Often the degree of difficulty induced by [the block of inadequate language skill, and reluctance to draw to help overcome it] is not even appreciated, since the describer knows exactly what he is trying to describe, and the describee often naturally assumes that she understands exactly what the other person is describing. (p. 92)

List-making is one of the simplest, most direct methods of increasing your conceptual ability. (p. 133)

Many creativity "techniques" have to do with breaking our mental set — diverting us from accepting the answer that first occurs to us by making us develop and consider others. (p. 134)

If you cannot solve the proposed problem, try to solve first some related problem. Could you imagine a more accessible related problem? A more general problem? A more special problem? An analogous problem? Could you solve a part of the problem? (p. 142)

If you want people to think creatively [their time] should not be scheduled down to the minute. Environmental change must be tailored to the desired goal and level of creativity. (p. 146)

Some of the more important conceptual blocks that apply to groups are: ... poor leadership, ... lack of proper support. (p. 159)

Affiliation needs underlie many of the conceptual blocks discussed [in this book]. People will like you if you think the way they do. But to the extent you succeed in aligning your thoughts with those of others, you can add to your perceptual and intellectual blocks. (p. 166)

A major organizational problem is to properly balance creativity and control. (p. 176)
Images: BBC, Amazon

Sunday, July 23, 2017

Ignorance, Recognised

Stop Working & Start Thinking is intended to help postgraduate students make profitable use of an essential piece of scientific equipment: their mind. I'm only a short way in, and finding it a bit dense at times, but there are already a few passages I'm loving. Here's one (page 15):
Science asks questions, and it has a small variety of ways to look for answers. They are observation, measurement, investigation and experiment. Different kinds of problem need different approaches for their solution and one of the ways the experienced scientist knows which to use is that she or he has got it wrong many times in the past! This cannot be said too often or emphasised too much. Ignorance, recognised, is the most valuable starting place; all scientists should have many stories about where they were sure, and wrong; where they were ignorant but did not know it.
Image: Goodreads

Thursday, July 20, 2017

In Two Minds

Jerry Weinberg's definition of quality is well known. It is generally applied to encapsulate a relationship between a person and a product, at a particular time, and goes like this:
Quality is value to some person
It is intended to be a practical tool, and I think Weinberg would agree with something like this as a gloss for it: theoretical assessments of quality — perceived quality — are less important than those which are motivated by action. For example, a property is worth what someone actually pays for it. Without the action, it's just philosophy. What someone is willing to pay, or sacrifice, determines the quality (to them) at that moment.

I've thought about this definition a lot over the years. In particular I've found myself speculating about the granularity of the definition. Back in 2012 I was wondering whether it was interesting to consider quality in terms of the aggregation of a set of qualities; more recently I was thinking about the way that product quality and its effect on quality of life might be interesting; and just now I've been worrying away at the possibility of holding conflicting views of the quality of a product at the same time.

Although there are numerous examples in Weinberg's work of multiple people with differing opinions of a product at once, I haven't found any where a single person has that. Here's one relevant extract, from Quality Software Management volume 1, page 5:
For different persons, the same product will generally have different "quality," as in the case of my niece's word processor. My [complaint about a bug not being fixed] is resolved once I recognize that to [my niece], the people involved were her readers; and to [the word processor developer], the people involved were the majority of his customers.
One of the things I find intriguing is that the definition, and its common usage, together seem to suggest that a person is only able to assert or demonstrate or alter their assessment of the quality of a product at certain times: when they pay for the product, and when they are making use of the product. So I tried some thought experiments.

In the first, I imagined that I might be in a position to buy a Bentley:
  • I believe that the build standards of a Bentley are much higher than those of cheaper cars. I will pay more for a higher build standard. This is a measure of value. As quality is value, I think the car is high quality.
  • I believe I would get no extra benefit from a Bentley over some other cheaper car, given how I use my car. So I won't pay a high price for a Bentley. This is a measure of value. As quality is value, I think the car is low quality.

And then I reflected:
  • I feel that I can hold these kinds of opposing views at the same time without problem (at least in some cases).
  • I speculate that quality can be a relationship between a person and product-attribute rather than a product.
  • My examples are couched in terms of belief rather than actual knowledge (I have never even sat in a Bentley)
  • ... so to Weinberg I've really got a statement about perceived value, if anything, here.
  • Perhaps related, quality assessment of wants (I'd love a Bentley) could be different to needs (I have to have some personal transport).
  • Is there always, ultimately, some overriding single attribute of quality that wins out for any given person, at a given time, and so multiple perceived qualities collapse at the point of use into a single assessment?
  • Or perhaps simultaneity is a false perception here. Maybe I am switching between views — very rapidly — and only hold one at any given time.
  • Another angle: when I consider two contexts of use, or aspects, or applications of a product, could I really be considering effectively two different products?

I tried another scenario, which attempts to take belief and perception out of the equation by using a more mundane product that I have personal experience of. Let's say I have bought a new pen and I want to use it for two tasks: taking notes while standing up and taking notes while suspended by my feet.
  • The pen is suitable for the first task and I am very happy with the price I paid. I say this is a good quality pen.
  • The pen is not suitable for the second task. I had to keep inverting it to let ink run back to the nib end, which I am unwilling to do any longer. I say this is a low quality pen.

More thoughts:
  • At the point where I pay for my pen, by Weinberg's model, I make an explicit statement about the quality of the pen for me.
  • Unintuitively, perhaps, if I've never used such a pen before this is based only on my perception of the value the pen will return to me
  • ... so perceived quality can turn into actual quality with no additional evidence to back it up
  • ... and, on engagement with the pen, I might rapidly revise my opinion.
  • Once I've paid for it, I express my view of the quality of the pen by the extent to which I am prepared to sacrifice to use it
  • ... and (if I understand the model) effectively the only time at which I can express this view is at the point of use
  • ... because at other times I merely express a perception of what I would do when I came to use it
  • ... and so can I change my expression of the quality of the pen without using it?
  • For example, can I express an opinion on quality by choosing not to use something?
  • ... but then how to distinguish between something that I happen not to use and something that I actively don't use, and something that I use only occasionally but is perfect for a particular task?

And then I stopped and dumped my notes here, after pondering how much I was prepared to sacrifice to continue this particular line of thought at this time.

With thanks to Jerry for patiently listening to me trying to make some kind of argument along these lines in email, and then patiently declining to agree. And also to Šime for prompting more thoughts when I was going round in circles.

Edit: Simon Morley followed up his comments on this post with Quality-Value: Heuristic or Oracle?

Wednesday, June 28, 2017

Cambridge Lean Coffee

This month's Lean Coffee was hosted by us at Linguamatics. Here are some brief, aggregated comments and questions on topics covered by the group I was in.

If we don't do testing, what do we replace it with?

  • We move test environment and tooling into Dev.
  • But practically, how do you ensure the customer gets the right thing?
  • Testing vs checking: testers need to exist.
  • Perhaps the tester just becomes an advisor?
  • With more ability to push into production more often and roll back if there's a problem, there can be less testing.
  • Even if testing is done elsewhere (by developers or customers) we still need someone to ask pertinent questions about the product, to evaluate risks.
  • And where is the test manager?
  • The test manager is taking a more strategic view, coaching, keeping people aligned, across products and projects.
  • Testing is being pushed left (into Dev) and pushed right (into production) and up (into the business).
  • Then what would be down?
  • Why do we need test managers? Why not just engineering managers?
  • Managers with relevant technical skills are respected by staff.

Formal test plans. How can they help in coordinating phases or levels of development?

  • A fair analogy to the questioner's context might be coordination between teams building layers of unit tests, integration tests, end-to-end tests.
  • How do you get the right amount of co-ordination between the different phases?
  • How can you compare the coverage in each phase?
  • Can a formal test plan (whatever that is) be a way to begin to share what's done in each phase?
  • ... and make it consistent across releases?
  • Talk to people!
  • Consider checklists over something heavyweight.
  • What problem are you trying to solve here?
  • A perception that there is repeated work in phases, and that this impedes delivery to market.
  • Are you concerned that there might be testing that no-one is doing?
  • Oops.
  • Can all phases work in one environment so that more can be shared?
  • Is there a way to instrument environments to tell what of the product functionality is being exercised in each environment?
  • ... perhaps something like code coverage metrics in software?

What are you reading? Has it helped you? How?

  • Conceptual Blockbusting by James L Adams.
  • ... it identifies ways in which creativity is blocked
  • ... and it suggests techniques for overcoming them
  • ... which I find valuable as I view testing as inherently creative.
  • Podcasts more than reading at the moment
  • ... because I can do them at the same time as something else
  • ... topics include culture, science, testing, mental health, sports
  • ... and I find I can draw parallels to my work.
  • Coaching Agile Teams by Lyssa Adkins
  • ... right now, I'm re-reading the section on conflict resolution
  • ... because it's relevant to my work situation.