Research methods & source literacy

Evidence context in animal behavior

When you read that an animal does something — caches food, uses a tool, recognises a mark in a mirror, cooperates with neighbours — the claim is only as meaningful as the evidence behind it. A behaviour seen once in a single zoo animal is not the same kind of claim as a pattern documented across many wild populations, even when both end up phrased as a confident sentence. FaunaHub attaches a short evidence-context label to behaviour claims so that the kind of evidence travels with the claim instead of being quietly dropped.

These labels are descriptive, not a scoring system. They tell you where a finding comes from and how broadly it has been observed — field or lab, captive or wild, one setting or several, settled or still argued over. They do not rate how 'smart' an animal is, rank species, or assign a confidence percentage. The point is the opposite of a ranking: to slow down overgeneralisation and keep each claim tied to the conditions under which it was actually observed.

A plain-language guide to FaunaHub's seven descriptive evidence-context labels, explaining what kind of study or observation sits behind a behaviour claim and why these are context tags, not confidence scores.

Key concepts

Evidence context (not confidence score): A short label describing the type and breadth of evidence behind a behaviour claim. It answers 'what kind of study or observation is this?' rather than 'how certain are we, on a scale?'. There is no number, grade, or star rating attached.
Field vs captive vs wild: Where the behaviour was observed matters. Field and wild observations capture natural context but are harder to control; captive and lab work allows control but can shape behaviour. The labels keep that distinction visible instead of blurring it.
Mixed-evidence: Used when a claim rests on more than one kind of source — for example, controlled tasks plus wild observation that point the same way. It signals converging evidence, not that the matter is confused.
Debated: Applied where careful researchers genuinely disagree about how to interpret a behaviour, or whether a simpler explanation accounts for it. It flags an open question, not a debunked claim or a fringe idea.
Broad-group pattern: Marks a general statement about a large group (a family, order, or 'most mammals') rather than a documented finding in named species. It warns the reader not to read it as precise, per-species evidence.

Why a label travels with each claim

A behaviour claim and the evidence behind it are easy to separate, and once separated the evidence is usually the part that gets lost. 'Octopuses use tools' is a tidy sentence whether it rests on one aquarium anecdote or a body of careful observation, and a reader has no way to tell the difference from the sentence alone. The evidence-context label is FaunaHub's way of keeping the second piece of information attached to the first.

The labels are deliberately plain and descriptive. They name the kind of evidence — natural observation, a controlled task, a captive setting, a wild setting, several sources together, an unsettled question, or a statement about a whole group — without translating any of that into a number. This matters because a number invites exactly the comparison these pages try to avoid: it would tempt readers to rank claims, and then animals, on a single scale, which is the error comparative cognition has spent decades moving away from.

Used well, a label is a prompt rather than a verdict. It nudges the reader to ask the question a careful ethologist would ask — observed where, under what conditions, and in how many animals — and to hold the claim a little more loosely or a little more firmly depending on the answer.

The seven labels, in plain terms

Field-observation means the behaviour was watched in a natural or near-natural setting, without the researcher controlling the conditions. Its strength is realism: the animal is doing what it does in its own world. Its limit is that uncontrolled settings make it harder to rule out simpler explanations or coincidence. Controlled-study means the behaviour was tested under arranged conditions designed to isolate one factor and exclude alternatives — for instance, a task built so that subtle cues from a handler cannot explain the result. Its strength is rigour; its limit is that an arranged task may not reflect how the animal behaves outside it.

Captive-study marks evidence gathered from animals in zoos, aquariums, sanctuaries, or laboratories. Such work can be valuable and is sometimes the only way to observe a behaviour closely, but captivity alters space, diet, social grouping, and stress, so a captive finding describes captive animals unless wild observation confirms it. Wild-study marks evidence from free-living animals in their natural range, which captures natural context but is harder to standardise and replicate. The two are complementary: each fills gaps the other leaves.

Mixed-evidence is used when a claim draws on more than one of these — say, a controlled task and independent wild observation that agree. It signals converging lines of evidence, which is generally a stronger footing than any single source. Debated flags a behaviour where careful researchers still disagree about interpretation, or where a simpler explanation has not been ruled out; it marks a live question, not a discredited one. Broad-group pattern marks a general statement about a large group — a family, an order, or 'most' of a type of animal — rather than a documented finding in named species, and it warns the reader not to treat the general as if it were precise.
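For readers who think in data structures, the seven labels can be sketched as a plain, unordered set of tags that is stored alongside each claim. This is only an illustration, not a description of how FaunaHub stores anything: the names EvidenceContext, BehaviourClaim, and can_rank are assumptions made up for this sketch. The point it shows is that the labels carry text, not numbers, and that asking which label is "higher" has no answer.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceContext(Enum):
    """The seven descriptive labels. Values are label text, not scores."""
    FIELD_OBSERVATION = "field-observation"
    CONTROLLED_STUDY = "controlled-study"
    CAPTIVE_STUDY = "captive-study"
    WILD_STUDY = "wild-study"
    MIXED_EVIDENCE = "mixed-evidence"
    DEBATED = "debated"
    BROAD_GROUP = "broad-group pattern"


@dataclass(frozen=True)
class BehaviourClaim:
    """Hypothetical record: the label is kept together with the claim text."""
    text: str
    context: EvidenceContext


def can_rank(a: EvidenceContext, b: EvidenceContext) -> bool:
    """Return True only if the two labels support an ordering comparison."""
    try:
        _ = a < b  # a plain Enum defines no ordering, so this raises TypeError
    except TypeError:
        return False
    return True


claim = BehaviourClaim("Many corvids cache food", EvidenceContext.BROAD_GROUP)
print(f"{claim.text} [{claim.context.value}]")
print(can_rank(EvidenceContext.CONTROLLED_STUDY, EvidenceContext.FIELD_OBSERVATION))
```

A plain Enum is used here on purpose: it lets labels be compared for equality but not sorted, which mirrors the idea that the labels describe kinds of evidence rather than positions on a scale.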

How to read the labels well

The most important habit is to resist turning the labels into a ladder. Controlled-study is not 'better' than field-observation, and wild-study is not 'better' than captive-study; they answer different questions and carry different limits. A controlled task can show that an animal is capable of something under ideal conditions while telling you little about whether it does so in the wild, and a field observation can show what an animal really does while leaving the underlying mechanism open. The honest reading weighs the label against the claim being made, not against the other labels.

It also helps to match the label to the strength of the wording. A wild-study or mixed-evidence label sits comfortably under a confident, general statement; a captive-study or field-observation label asks for a more cautious one, often about specific populations rather than a whole species. When a claim is tagged debated, the right response is curiosity rather than dismissal: it is a place where the science is still being worked out, and where any single tidy summary would be overstating the case. A broad-group label, finally, is a reminder that the sentence is painting with a wide brush, and that the picture for any one named animal may differ.
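The guidance in the paragraph above can be written down as a small lookup from label to reading prompt. This is a minimal sketch under stated assumptions: READING_PROMPTS, DEFAULT_PROMPT, and reading_prompt are hypothetical names, and the prompt wording is a paraphrase of this page rather than an official FaunaHub rule set. Labels the paragraph does not mention, such as controlled-study, fall back to the general question.

```python
# Hypothetical lookup: label text -> a reading prompt paraphrased from this page.
READING_PROMPTS = {
    "wild-study": "Sits comfortably under a confident, general statement.",
    "mixed-evidence": "Sits comfortably under a confident, general statement.",
    "captive-study": "Asks for cautious wording, often about specific populations.",
    "field-observation": "Asks for cautious wording, often about specific populations.",
    "debated": "Respond with curiosity, not dismissal; a tidy summary would overstate it.",
    "broad-group pattern": "Wide brush; the picture for any one named animal may differ.",
}

# Used for any label without a specific prompt above.
DEFAULT_PROMPT = "Ask: observed how, where, and how widely?"


def reading_prompt(label: str) -> str:
    """Return the prompt for a label, ignoring case and surrounding spaces."""
    key = label.strip().lower()
    return READING_PROMPTS.get(key, DEFAULT_PROMPT)


print(reading_prompt("Debated"))
print(reading_prompt("controlled-study"))
```

The lookup returns prompts, not scores, so it keeps the label in its role as a nudge towards a question rather than a verdict.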

None of this requires specialist knowledge — only the small, repeatable question the labels are built to provoke: observed how, where, and how widely? A reader who asks that of each claim is already reading animal-behaviour writing the way researchers wish it were read.

Why this matters for reading behavior claims

The single biggest error in popular animal-behaviour writing is overgeneralisation — taking one captive individual, or one clever study, and stretching it into a claim about a whole species or 'all' of a group. An evidence-context label makes that stretch visible at a glance, so a reader can tell a documented field pattern from a one-off captive observation before deciding how much weight to give it.

Keeping the evidence attached to the claim is also a matter of honesty. It lets FaunaHub describe genuinely striking behaviour without overclaiming, and it lets readers build the habit of asking 'observed how, where, and in how many animals?' — the same question working ethologists ask of each other.

Common mistakes this helps you avoid

Treating the labels as a quality or confidence ranking — assuming 'controlled-study' beats 'field-observation', when in fact each method has different strengths and limits and neither is automatically stronger.
Reading a 'captive-study' claim as a description of wild behaviour. Captivity changes space, diet, company, and stress, so a captive finding describes captive animals unless wild observation confirms it.
Taking a 'broad-group' statement as precise, per-species fact — for example reading 'many corvids cache food' as proof that one named bird does, in the way a documented study would show.
Seeing 'debated' as a synonym for 'false' or 'discredited'. It marks an open, live disagreement among careful researchers, not a settled negative verdict.
Assuming a 'mixed-evidence' label means the evidence conflicts. It usually means several kinds of evidence converge on the same picture, which is generally a stronger position than any one source alone.

What this page does not establish

These labels describe the kind and breadth of evidence behind a claim; they are not confidence scores, certainty ratings, or quality grades, and they do not rank species or settle scientific debates. A label tells you how a behaviour was studied and how widely it has been observed — it cannot tell you, on its own, that a claim is definitely true, that it applies to every individual, or that the underlying interpretation is beyond dispute. FaunaHub assigns labels editorially from how the relevant literature is generally characterised; they are a reading aid, not a formal measurement or a substitute for the primary research.

How FaunaHub uses sources

These methodology notes sit alongside FaunaHub's wider source practice. See animal research sources and how FaunaHub uses sources, and return to the animal intelligence & behavior hub.

Frequently asked questions

Are these evidence labels a confidence score or a ranking?
No. They are descriptive context tags that say what kind of evidence stands behind a claim and how broadly the behaviour has been observed. They do not assign a number, a grade, a star rating, or a confidence percentage, and they do not rank animals or studies against each other. The whole purpose is to avoid that kind of single-scale ranking, which is exactly the error these pages warn against. A label tells you where a finding comes from so you can judge it; it does not do the judging for you.

Why distinguish captive-study from wild-study at all?
Because where a behaviour is observed changes what the observation means. Captive animals live with different space, diet, social grouping, and stress than free-living ones, so a behaviour seen in a zoo, aquarium, or lab describes captive animals unless wild observation confirms it carries over. Wild-study captures natural context but is harder to control and replicate. Keeping the two labels separate stops a captive finding from being read, by accident, as a statement about how the animal behaves in the wild, which is a very common overgeneralisation.

Does the 'debated' label mean the claim is false?
No. 'Debated' marks a behaviour where careful researchers genuinely disagree about how to interpret what is seen, or where a simpler explanation has not yet been ruled out. It flags an open, active question, not a debunked or fringe idea. The appropriate response is to treat the claim with curiosity and to expect that a single confident summary would be overstating it, rather than to dismiss it outright.

Why use a 'broad-group' label instead of just naming species?
Because some statements are genuinely about a large group (a family, an order, or 'most' of a kind of animal) rather than a documented finding in particular named species. The broad-group label is honest about that breadth: it warns the reader not to read a general pattern as precise, per-species evidence. Where FaunaHub can point to a specific study in named animals, it uses the field, captive, wild, controlled, or mixed labels instead, which carry more specific information.

Last updated: 2026-06-28