When to use each pipeline

The tables below provide an overview of each analysis pipeline and offer guidance on when and how to use them effectively.

Overview 

| | Thematic Analysis | Transcript Analysis | Key Term Extraction | Codeframe Classification |
|---|---|---|---|---|
| Purpose | Discover the themes and topics hidden in open-ended responses | Analyse long-form content such as interview transcripts, focus-group discussions, and multi-paragraph feedback | Extract and count the specific items (brands, products, features) that respondents mention | Classify responses against a predefined set of codes/categories provided by the user |
| Best for | Long, descriptive responses – sentences and paragraphs | Very long responses – multi-sentence and multi-paragraph (interviews, transcripts, focus groups) | Short responses – single words, brand names, product mentions | Any response length – when you already know the categories you want to code against |
| What it produces | A two-level hierarchy of parent themes and sub-themes | A two-level hierarchy of parent themes and sub-themes | A frequency-ranked list of normalised entities | A flat frequency distribution of responses across your predefined codes |
| How it classifies | Groups responses by meaning – responses about similar topics end up in the same theme | Splits each response into overlapping sentence windows, clusters the chunks by meaning, then aggregates back to document level | Reads each response and pulls out every named item mentioned | Matches each response to the most relevant code(s) from your uploaded codeframe |
| Multi-label support | Yes – primary theme plus secondary codes | Yes – primary theme plus secondary codes | Yes – multiple entities per response | Yes – multi-label is the default; most responses get multiple codes |
| Who defines the categories? | The system discovers them from the data | The system discovers them from the data (same as Thematic but optimised for long text) | The data itself – entities are extracted as-is | You do – you upload a codebook of themes and descriptions |
| Hierarchy | Yes – parent themes and sub-themes | Yes – parent themes and sub-themes | Flat list | Flat list (no sub-themes) |
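The Transcript Analysis row describes splitting each response into overlapping sentence windows and later aggregating chunk-level results back to the document. The sketch below illustrates that idea only; it is not the product's implementation. The function names, the window size and step, the regex-based sentence splitter, and the majority-vote aggregation rule are all assumptions made for illustration, and the clustering step in between is left out.

```python
import re
from collections import Counter


def sentence_windows(text, size=3, step=2):
    """Split text into overlapping windows of `size` sentences.

    `size` and `step` are illustrative defaults, not product settings.
    A step smaller than the size makes consecutive windows overlap.
    """
    # Naive sentence split on ., ! or ? followed by whitespace (assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    if len(sentences) <= size:
        return [" ".join(sentences)] if sentences else []
    windows = []
    for start in range(0, len(sentences), step):
        windows.append(" ".join(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # the last window already reaches the end of the text
    return windows


def aggregate_to_document(chunk_themes):
    """Roll chunk-level theme labels up to one document-level result.

    Hypothetical rule: the most frequent chunk theme becomes the primary
    theme; the remaining themes become secondary codes.
    """
    ranked = [theme for theme, _ in Counter(chunk_themes).most_common()]
    return {"primary": ranked[0] if ranked else None, "secondary": ranked[1:]}
```

With five sentences, a window size of 3, and a step of 2, this yields two windows that share the middle sentence, so a point made across a sentence boundary is less likely to be cut in half.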

Scenario Comparisons

| Scenario | Thematic Analysis | Transcript Analysis | Key Term Extraction | Codeframe Classification |
|---|---|---|---|---|
| Responses are sentences or paragraphs | Best | Best | | Good |
| Responses are single words or short phrases | | | Best | Good |
| You want to understand the topics people are talking about | Best | | Not designed for this | Partial – only finds topics in your codeframe |
| You want to count how many times each brand/product was mentioned | Not designed for this | Not designed for this | Best | Possible if your codeframe lists the brands |
| You want a parent/sub-theme hierarchy | Yes | Yes | | No – flat structure only |
| You already have a codebook and need responses coded against it | Not designed for this | Not designed for this | Not designed for this | Best |
| You need consistent, repeatable coding categories across waves | Themes may vary slightly between runs | Themes may vary slightly between runs | Good – entities come from the data | Best – categories are locked to your codeframe |
| Grouped columns (e.g. brand_1, brand_2, brand_3) | Yes – with mention rank filtering | Yes – with mention rank filtering | Yes – with mention rank filtering | Yes – with mention rank filtering |
| Fewer than 50 responses | Not enough data to find reliable patterns | Not enough data to find reliable patterns | Works well – even small datasets produce useful frequency counts | Works well – predefined codes don’t need large samples to apply |
| You need to compare results across waves or markets | Good – but themes may vary slightly between runs | Good – but themes may vary slightly between runs | Good – entities are consistent because they come from the data itself | Best – same codeframe guarantees identical categories every time |
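For a quick first pass, the scenario guidance can be condensed into a small helper. This is a rough sketch rather than an official selection rule: the function name, the `typical_length` labels, and the order in which the checks are applied are assumptions, while the 50-response threshold is taken from the comparison itself.

```python
def suggest_pipeline(has_codebook, typical_length, n_responses):
    """Suggest a pipeline from three facts about a dataset.

    typical_length is one of "short", "long", or "very_long"
    (hypothetical labels for words/phrases, sentences/paragraphs,
    and interview-style transcripts respectively).
    """
    if typical_length not in ("short", "long", "very_long"):
        raise ValueError("typical_length must be 'short', 'long' or 'very_long'")

    caveat = None
    if has_codebook:
        # A predefined codebook is the one case only Codeframe handles.
        pipeline = "Codeframe Classification"
    elif typical_length == "short":
        pipeline = "Key Term Extraction"
    else:
        pipeline = "Transcript Analysis" if typical_length == "very_long" else "Thematic Analysis"
        if n_responses < 50:
            # Theme discovery needs enough data to find reliable patterns.
            caveat = "Fewer than 50 responses: not enough data to find reliable patterns"
    return {"pipeline": pipeline, "caveat": caveat}


print(suggest_pipeline(has_codebook=False, typical_length="long", n_responses=30))
```

The codebook check comes first because coding against an existing codebook is the only scenario the other three pipelines are not designed for.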