
SMART metrics: definition, framework, and examples

SMART metrics test whether a number is defensible. Definitions, framework, six design principles, worked example, and FAQ for program teams.

Updated May 4, 2026

A metric reports a number. A SMART metric defends that number. Most fail before the meeting ends.

SMART stands for Specific, Measurable, Achievable, Relevant, and Time-bound. Each letter is a separate test the metric has to pass. This guide explains what each test catches, why most metrics fail at least one, and how to recognize a defensible measurement before it lands in a board deck or a funder report. Worked examples come from a workforce training program.

The framework

Five tests. Five different failures. Drop one, and the failure it catches walks through.

SMART is taught as a single acronym, but it is really five separate filters. Each letter catches a different kind of metric error. The pathway below shows the order most teams apply them in. The band underneath names the failure each letter catches when it is the one missing.

The pathway from claim to defensible metric

S — Specific: names what is counted, who it covers, and which condition must hold.
M — Measurable: states the unit and the data source. The number has to come from somewhere.
A — Achievable: anchors the target to a baseline. Not a wish, not a stretch with no floor.
R — Relevant: ties the metric to the outcome the program is meant to produce.
T — Time-bound: names the window. Without a window, the count never reports.

What each letter catches

S: a vague outcome word like impact, growth, or engagement.
M: an opinion that no instrument can produce as a number.
A: a target set without comparing to a prior baseline.
R: a metric that does not match the program theory.
T: a measurement that runs forever and never reports.

Because the five letters catch different failures, the framework only works when all five tests are applied. A metric that satisfies four of five is the metric that breaks in front of the board.

Pathway adapted from Doran, G. T. (1981). There’s a S.M.A.R.T. way to write management’s goals and objectives. Management Review, 70(11). The failure-mode layer was added by Sopact.
Definitions

SMART metrics, defined

Four definitions program teams ask for in this order: the meaning, the formal definition, the working list of what counts as a SMART metric, and the test for whether a single metric is measurable. Each answer is short on purpose.

SMART metrics meaning

SMART metrics are measurements that pass five tests at once: Specific, Measurable, Achievable, Relevant, and Time-bound. The acronym was introduced by George Doran in 1981 as a way to write management goals that survive the next quarterly review. The same five tests now apply to performance indicators, OKRs, KPIs, and impact measurement.

The meaning that matters in practice: a SMART metric is a metric a team can defend. The number can be traced back to a source row of data and forward to a decision the program needs to make.

SMART metrics definition

Formally, a SMART metric is a quantitative measurement that satisfies all five SMART criteria simultaneously. Specific: the metric names a precise condition, population, and unit. Measurable: the metric has a defined instrument or system that produces the count. Achievable: the metric is anchored to a baseline and a realistic target. Relevant: the metric matches the program theory or business outcome it is meant to inform. Time-bound: the metric specifies the window over which the count applies.

Some practitioners extend the acronym to SMARTER (adding Evaluated and Reviewed) to emphasize the periodic recheck. The five-letter version remains the working definition in most monitoring and evaluation guidance and in most KPI literature.

What are SMART metrics in practice?

In practice, SMART metrics are the small set of numbers a program team chooses to defend each quarter. They are the metrics that show up in the board deck, the funder report, and the planning meeting that decides what the program does next. Every other measurement is either a diagnostic signal or a vanity metric.

A workforce training program might run forty data points across its intake, mid-cycle, and exit surveys. Five of those become its SMART metrics: cohort completion rate, training-matched placement rate at six months, employer satisfaction with cohort hires, wage gain against baseline, and twelve-month retention. The other thirty-five are the diagnostic context that explains the five.

What is a measurable metric?

A measurable metric is one where the M test passes. The metric names a unit (people, dollars, days, sessions, placements) and a source (the survey question, the program record, the payroll system) the count comes from. Without both, the metric is an opinion dressed in number-shaped clothing.

The most common M failure is a metric written as a verb: "improve engagement," "grow capacity," "strengthen pipeline." None of these name a unit or a source. Rewriting them as countable nouns with named instruments is what passing the M test looks like.
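As a rough illustration of what passing the M test looks like when the definition is written down as data rather than prose, here is a minimal Python sketch. The class and field names are hypothetical, not any particular tool's schema; the unit and source values echo the retention example used later in this guide.

```python
from dataclasses import dataclass

@dataclass
class MetricDefinition:
    name: str    # a countable noun, not a verb
    unit: str    # what gets counted: people, dollars, days, sessions, placements
    source: str  # the instrument or system the count comes from

# Fails the M test: a verb with no unit and no source.
draft = "improve engagement"

# Passes the M test: a countable noun with a named unit and instrument.
rewritten = MetricDefinition(
    name="mid-program retention rate",
    unit="percentage of cohort attending session 6 of 8",
    source="attendance roster",
)
```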

You may also see this called something else

Vocabulary varies across communities. KPI consultants tend to use SMART performance indicators, SMART performance metrics, and SMART performance measures. Monitoring and evaluation teams say SMART measure, SMART measures, or SMART measurement. OKR coaches talk about actionable metrics. Data analysts use SMART methodology in data analysis to describe a definitional pre-flight check before computation.

Whether the page in front of you discusses the SMART metrics framework, measurable metrics, or what SMART means when setting metrics, the underlying definition is the five tests above. The meaning of a measurable metric has not changed since 1981: a unit, a source, a baseline, relevance to the program theory, and a window over which the count applies.

Related, but different

Metric

A measurement of any kind. Metrics may or may not pass the SMART tests. Most do not.

KPI

A key performance indicator. A metric that has been promoted to the short list a team reviews. SMART criteria for KPIs apply the same five tests at the indicator level.

Goal

The outcome the metric is meant to track toward. SMART goals were the original 1981 application; the criteria moved to metrics and KPIs as the framework spread.

Indicator

A measurement chosen because it stands in for a harder-to-observe outcome. SMART indicators apply the five tests with extra attention to the Relevant test (does the indicator track the right thing).

Design principles

Six principles that decide whether a SMART metric survives the next quarter

The five-letter test is the filter. These are the design choices teams make while drafting a metric so it actually passes the filter on the first try, not on the third rewrite during board prep week.

01 · SPECIFICITY

Name the population, not the program

Whose change is the metric measuring?

Metrics fail the Specific test when they describe what the program does instead of who the program changes. "Workshops delivered" is a program activity; "graduates placed in matched roles" is a population outcome. Name the population first.

Why it matters. The population framing forces a denominator, which is what makes the metric comparable across cohorts.

02 · MEASURABILITY

Pick the unit before you pick the target

What gets counted, by what instrument?

Targets set before units are how vanity numbers happen. Decide what unit the metric counts in (people, dollars, days, percentage points against a baseline) and which system or survey produces the count, then set the target.

Why it matters. The unit constrains the target to something defensible instead of aspirational.

03 · ACHIEVABILITY

Anchor every target to a baseline

What did the prior cohort, prior year, or prior intervention produce?

A target without a baseline is a wish. A baseline without a target is a report. Both fail the Achievable test. The pattern that works names the prior number, the new target, and the absolute or relative gain expected.

Why it matters. Baseline-anchored targets are the only kind a board can challenge constructively.
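A minimal sketch of the arithmetic a baseline-anchored target carries with it, using the illustrative figures from the worked example later in this guide (a sixty-three percent prior-cohort baseline and an eighty percent target):

```python
baseline = 0.63  # prior-cohort placement rate
target = 0.80    # target for the current cohort

absolute_gain = target - baseline         # 0.17 -> a 17-point gain to defend
relative_gain = absolute_gain / baseline  # ~0.27 -> a 27% relative lift

print(f"Target {target:.0%} against baseline {baseline:.0%}: "
      f"{absolute_gain * 100:.0f}-point gain ({relative_gain:.0%} relative)")
```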

04 · RELEVANCE

Tie the metric back to the program theory

If this metric moves, does the program theory move with it?

Relevance is the test most often skipped because it requires knowing what the program is supposed to produce. A metric that does not connect to a step in the theory of change is a metric that survives quarterly reporting but cannot inform a single program decision.

Why it matters. Relevance is what separates SMART metrics from busywork metrics.

05 · TIME WINDOW

Name the window or the metric never reports

When does the count start and stop?

"Improve placements" runs forever and reports never. "Placements within six months of program completion, measured at the end of each quarter" is countable, finishable, and comparable. Time-bound is the test that forces the metric to be something the team can finish.

Why it matters. Without a window, the metric is permanently in progress and never lands a decision.
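A small sketch of what the T test forces in practice: a start, a stop, and a rule for deciding which records fall inside the window. The dates and the 183-day cutoff below are hypothetical, chosen only to illustrate the six-month window named above.

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # illustrative cutoff for "within six months"

def counts_as_placed(completion: date, placement: date | None) -> bool:
    """Count a placement only if it lands within six months of program completion."""
    return placement is not None and completion <= placement <= completion + SIX_MONTHS

# Hypothetical records: (program completion date, placement date or None)
records = [
    (date(2025, 3, 28), date(2025, 7, 10)),  # inside the window: counts
    (date(2025, 3, 28), date(2025, 11, 2)),  # outside the window: does not count
    (date(2025, 3, 28), None),               # no placement yet: does not count
]
placed_in_window = sum(counts_as_placed(c, p) for c, p in records)
```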

06 · SHORT LIST

Pick five metrics, not fifty

Which numbers are the team actually willing to defend?

Programs collect dozens of data points. Only a handful become SMART metrics. The rest are diagnostic context. Trying to make every data point pass the five tests is how dashboards become unreadable. Pick the small set the team will defend, and make those defensible.

Why it matters. A short list of defensible metrics outperforms a long list of suggestive ones every quarter.

Choices

Six choices that decide whether your metrics survive the first board review

The first decision in metric design controls all the others. Most teams spend their reporting effort fighting downstream consequences of an upstream choice they did not realize they were making. The matrix below names each choice and what the broken and working versions look like.

The choice
Broken way
Working way
What this decides
Picking the metric subject
Whose change is being counted?
BROKEN
Subject is the program. "Workshops delivered." Counts activity, not change.
WORKING
Subject is the population. "Graduates placed in matched roles." Counts change in the people the program is meant to serve.
Whether the metric can describe outcomes or only describe activity.
Defining the unit
What is the count counting in?
BROKEN
Unit is implicit or aspirational. "Improve engagement." No instrument can produce a number.
WORKING
Unit is explicit. "Mid-program retention rate, percentage of cohort attending session 6 of 8, source: attendance roster."
Whether the metric can be computed at all when the data lands.
Setting the target
What does success look like in numbers?
BROKEN
Target with no baseline. "Hit eighty percent." A round number written in a planning meeting.
WORKING
Target anchored to a baseline. "Eighty percent placement, against the prior cohort baseline of sixty-three. A seventeen-point gain."
Whether the target is defensible or whether the next planning meeting reopens it.
Linking metric to program theory
If this number moves, does the program theory move with it?
BROKEN
Metric chosen because it is countable. "Email open rate." Tracks engagement signal, not the outcome the program is meant to produce.
WORKING
Metric chosen from the theory of change. "Wage gain at six months." Maps to the outcome step the program theory is built around.
Whether the metric can inform program decisions or only fill a quarterly slide.
Setting the time window
When does the count start and stop?
BROKEN
No window stated. "Increase placements." Runs forever. Never reports a final number.
WORKING
Window stated. "Placements within six months of program completion, measured at the end of each quarter." Has a start, an end, and a comparison cycle.
Whether the metric ever finishes a reporting cycle or stays in progress for years.
Sizing the metric set
How many SMART metrics is the right number?
BROKEN
Every data point promoted to KPI. Forty metrics in the deck. None of them defended deeply.
WORKING
Five to seven SMART metrics. The rest kept as diagnostic context. Each defended one row deep when challenged.
Whether the team can defend any metric when a reviewer pushes on the source.
Worked example

A workforce training program rewrites one metric through the five tests

The program runs an eight-week training cohort with about sixty graduates per cycle. The board wants one number that answers whether the program is working. The team has a draft. It does not survive any of the five tests. Here is the rewrite.

We started the year reporting "improved job outcomes" because that is what the funder asked for. The board asked what we meant by improved. We said placements were up. They asked compared to what. We said this cohort versus last cohort. They asked how we counted placement, and whether self-reports and verified hires were the same number, and what window we were measuring. We did not have answers. So we sat down and rewrote the metric until every question had a one-line answer.

Workforce training program lead, end of first board cycle

Quantitative axis

The number

47 of 60 graduates placed in a training-matched role within six months of program completion, against a prior-cohort baseline of 38 of 60.

Qualitative axis

The meaning behind it

Open-ended responses from the same graduates explain which placements were career changes, which were lateral moves, and which the graduates themselves credit to the training versus prior experience.
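As a quick check on the quantitative side, the arithmetic behind that headline number is only a few lines; the figures below are the ones stated above.

```python
placed, cohort = 47, 60
baseline_placed, baseline_cohort = 38, 60

rate = placed / cohort                              # ~78.3% placement this cohort
baseline_rate = baseline_placed / baseline_cohort   # ~63.3% for the prior cohort
gain_points = (rate - baseline_rate) * 100          # ~15-point gain over baseline
```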

What Sopact Sense produces

  • Specificity, by named field
    "Training-matched role" is a stored attribute on the placement record, not a phrase the team interprets each quarter.
  • Measurability, by tracked instrument
    The placement count comes from the six-month follow-up survey linked to the graduate’s stakeholder ID. One source, one unit.
  • Achievability, by linked baseline
    Prior-cohort placement rate is queryable in the same workspace. The baseline is not pasted from a separate spreadsheet.
  • Relevance, by program theory tag
    The metric is tagged to the outcome step in the program theory captured at intake design. Anyone can see why the metric was chosen.

Why traditional tools fail

  • Specificity drift across exports
    Each quarterly export of the placement question reinterprets "matched role" because the field is free text in the form tool.
  • Measurability lost between systems
    Survey responses live in one tool, the placement tracker in another, and a stakeholder ID is missing in both. Counts disagree.
  • Achievability without a baseline
    Prior-cohort data sits in an old export the analyst cannot find. Targets get set against round numbers, not prior performance.
  • Relevance forgotten by quarter three
    By the third report cycle, no one remembers why "matched role" was chosen over "any placement." The metric stops informing decisions.
Where SMART metrics show up

Three program shapes, three different SMART metric problems

The five-letter test is the same. What changes by program shape is which letter breaks first. These three contexts show the failure modes most common in each, and the working pattern that holds up in the next review.

01

Workforce training programs

Cohort cycles. Intake, mid-cycle, exit, six-month follow-up. Funder reporting on placement and wage outcomes.

The typical shape is a sixty-person cohort with four collection touchpoints. Intake captures baseline employment status. Mid-cycle captures attendance and engagement. Exit captures completion and a first placement signal. Six-month follow-up captures the placement that matters for outcomes reporting.

What breaks first is the M test. The placement tracker lives in a spreadsheet, the surveys live in a forms tool, and the IDs do not match. Quarterly reporting becomes a manual exercise in which the analyst spends three days matching records before a single metric gets computed. By the time numbers reach the board, half the definitions have shifted.

What works is anchoring every measurement to a stakeholder ID created at intake. The placement tracker, the four survey instruments, and the cohort baseline all reference the same ID. Each SMART metric (placement rate, wage gain, retention) traces to a single source row. The board cycle becomes a review of numbers, not a review of how the numbers were assembled.
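The structural idea — every instrument referencing the stakeholder ID created at intake — can be sketched in a few lines of pandas. The column names and values are hypothetical; this illustrates the join itself, not how any particular platform stores its data.

```python
import pandas as pd

# Hypothetical extracts: every record carries the stakeholder ID assigned at intake.
intake = pd.DataFrame({
    "stakeholder_id": [101, 102, 103],
    "baseline_employed": [False, False, True],
})
followup = pd.DataFrame({
    "stakeholder_id": [101, 103],
    "placed_matched_role": [True, False],
})

# One join on the shared ID replaces days of hand-matching across tools.
cohort = intake.merge(followup, on="stakeholder_id", how="left")
placement_rate = cohort["placed_matched_role"].fillna(False).astype(float).mean()
```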

A specific shape

A workforce program reported placement rates ranging from forty to seventy percent across cohorts depending on the quarter. The range disappeared once the metric moved from "placements" to "training-matched placements within six months, source: follow-up survey." The variation was definitional, not actual.

02

Education and youth development

Multi-year cohorts. Long time horizons. Outcomes that depend on context the program does not control.

The typical shape is a four-year program with annual touchpoints and a five-year post-program follow-up. Indicators range from short-term (skill gain) to long-term (postsecondary enrollment, persistence, completion). The program theory has multiple steps and assumptions that depend on schools, families, and labor markets.

What breaks first is the R test. Long-horizon programs collect dozens of countable signals (attendance, GPA, recommendation counts) that pass the S, M, A, and T tests but do not actually map to the long-term outcomes the program is built around. The dashboard fills with metrics that move but do not inform program decisions.

What works is grading every candidate metric against the program theory before promoting it to the SMART set. A metric stays in the diagnostic layer unless it ties directly to an outcome step. The short list of SMART metrics ends up at five to seven, anchored to the theory; the rest are reported as context for explaining what the five mean.

A specific shape

A youth development organization tracked twenty-three indicators in its annual report. Twelve passed all five SMART tests; eleven did not. The eleven were demoted to diagnostic context the next year. Funder conversations got shorter and more focused on the twelve.

03

Impact funds and portfolio investors

Portfolio-level reporting across heterogeneous investees. Roll-up metrics that have to be defensible per fund and per LP report.

The typical shape is a fund with twenty to fifty portfolio companies, each reporting different operating metrics. The fund needs SMART metrics that roll up to the portfolio level for LP reports while staying defensible at the company level for diligence and review.

What breaks first is the S test, but at a different layer. Each investee defines metrics in its own context, so a portfolio-level "jobs created" or "carbon avoided" rollup mixes apples and oranges unless the fund standardizes the definition at the time of investment. By the time the fund tries to standardize after the fact, every investee already has its own version of the metric.

What works is a definition contract signed at investment closing, referencing IRIS+ codes or a fund-specific metric dictionary. The investee reports against the dictionary, the fund rolls up clean numbers, and LP reports defend the rollup back to a documented definition. The S test passes once at investment, not every quarter.
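A rough sketch of what a definition contract makes possible mechanically: once every investee reports against the same dictionary key, the portfolio rollup is a plain sum. The dictionary entries and company figures below are hypothetical; a real dictionary might reference IRIS+ codes rather than ad-hoc keys.

```python
# Hypothetical fund-level metric dictionary, fixed at investment closing.
metric_dictionary = {
    "jobs_created": {"unit": "FTE positions", "window": "trailing 12 months"},
    "carbon_avoided": {"unit": "tCO2e", "window": "calendar year"},
}

# Each investee reports against the dictionary key, so the rollup needs no reinterpretation.
investee_reports = [
    {"company": "A", "jobs_created": 42},
    {"company": "B", "jobs_created": 17},
    {"company": "C", "jobs_created": 65},
]
portfolio_jobs = sum(report["jobs_created"] for report in investee_reports)
```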

A specific shape

An impact fund with thirty portfolio companies reduced its annual LP report definition disputes from weeks of email threads to a single review meeting once each company reported against a shared metric dictionary anchored at closing. The dictionary did the work that quarter-by-quarter manual matching used to do.

Vendor note

Tools that collect data, and the architectural gap that breaks the M and A tests

Google Forms · SurveyMonkey · Typeform · Qualtrics · Sopact Sense

Forms tools handle collection well. They render questions, capture responses, and export to CSV. The architectural gap is that none of them carry a stakeholder identity across instruments, link to program records, or surface a baseline at the moment of metric design. The M test (a unit and a source) and the A test (a baseline and a target) both require those capabilities, and a forms tool plus a spreadsheet supplies them only through hand-matching every reporting cycle.

Sopact Sense was built to close that gap. Stakeholder IDs persist across instruments, program records and survey responses live in the same workspace, and metric definitions are tagged to the program theory at design time. The five SMART tests pass because the parts that defend them are held together structurally, not procedurally.

FAQ

SMART metrics questions, answered

The questions program teams ask most often when defining SMART metrics for the first time, or rewriting metrics that did not survive the last review.

Q.01

What is the definition of SMART metrics?

SMART metrics are program or business measurements that pass five tests at once: Specific, Measurable, Achievable, Relevant, and Time-bound. The acronym originated in a 1981 management paper by George Doran and is now applied across goal setting, performance indicators, and impact measurement. A metric that passes all five tests can be defended back to a source row of data and forward to a decision the program team needs to make.

Q.02

What does SMART stand for in the context of setting metrics?

In the context of setting metrics, SMART stands for Specific, Measurable, Achievable, Relevant, and Time-bound. Each letter is a separate test the metric has to pass. Specific catches vague wording. Measurable catches numbers that no instrument can produce. Achievable catches targets without baselines. Relevant catches metrics that do not match the program theory. Time-bound catches measurements that never report.

Q.03

What are SMART metrics?

SMART metrics are the metrics a program team can defend in a board meeting, in a funder review, or in a planning session three quarters from now. They name what is being counted, where the number comes from, what level of change is realistic against a baseline, why the metric matters to the program, and which window the count covers. Most metrics fail at least one of these tests.

Q.04

What is the SMART framework?

The SMART framework is a five-test checklist for goals, objectives, and performance indicators. It is one of the most cited frameworks in management literature and shows up in MBO programs, KPI design, OKR coaching, and impact measurement guidance. The framework does not generate a metric on its own. It is a filter applied to a draft metric to find which letter the metric fails.

Q.05

What is a measurable metric?

A measurable metric is one that names a unit and a source of data. The unit is what gets counted: people, dollars, days, sessions, placements. The source is the system or instrument the count comes from: an enrollment record, an exit survey, a payroll report. If a metric cannot be traced to both a unit and a source, the M in SMART has not been satisfied and the number cannot be defended.

Q.06

What is the difference between a metric and a SMART metric?

A metric reports a number. A SMART metric defends that number against five questions: what exactly is being counted, where does the count come from, is the target realistic given the baseline, does the metric match the program theory, and over what window does the count apply. Most reporting failures happen because a team published a metric without applying the five tests first.

Q.07

Can you give SMART metrics examples?

A non-SMART metric: We improved job outcomes. A SMART version: Eighty percent of cohort graduates report a job placement matched to their training within six months of program completion, against a prior-cohort baseline of sixty-three percent. The second version names the unit, the source, the baseline, the relevance to the training program, and the six-month window. A board reviewer can act on the second version. The first one starts a meeting about what was meant.

Q.08

What is SMART criteria for performance indicators?

SMART criteria for performance indicators apply the same five tests to KPIs as to goals. A SMART performance indicator names the population it covers, the data system it pulls from, a baseline against a target, the link to a strategic outcome, and the reporting window. Indicators that name only an aspirational direction (increase, improve, grow) fail the Specific and Measurable tests and produce reports the team cannot act on.

Q.09

What is an actionable metric?

An actionable metric is a metric that, once it lands in front of a decision maker, points to a next step. SMART is the structural test. Actionable is the consequence: a metric that is specific, measurable, anchored to a baseline, relevant to the program theory, and tied to a window almost always produces an action when the number moves. Vanity metrics fail because they pass none of the five tests and therefore point to no action.

Q.10

How do SMART metrics apply in monitoring and evaluation?

In monitoring and evaluation, SMART metrics are how output indicators, outcome indicators, and impact indicators get written so they can be reported quarterly without arguments. A logframe row that names a SMART indicator avoids the most common M and E failure: a quarterly review where the team disagrees about what the indicator was meant to measure in the first place.

Q.11

How is SMART used in data analysis?

SMART is used in data analysis as a pre-flight check on the metric definition before any computation runs. Analysts apply the five tests to confirm the metric maps to a column in a data system, has a defensible filter for the population, names a baseline window and a comparison window, and ties to a question the business actually needs answered. SMART does not replace statistical methods. It catches definitional errors that statistics cannot fix later.
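One way to picture that pre-flight check is as a small function that flags which letter a draft metric definition fails before any query runs. The field names below are hypothetical — a sketch of the check, not a standard schema.

```python
def preflight(metric: dict) -> list[str]:
    """Return the SMART tests a draft metric definition fails, before any computation runs."""
    failures = []
    if not metric.get("population_filter"):
        failures.append("S: no defensible filter for the population covered")
    if not (metric.get("unit") and metric.get("source_column")):
        failures.append("M: no unit or source column named")
    if metric.get("baseline") is None:
        failures.append("A: no baseline window to compare the target against")
    if not metric.get("decision_informed"):
        failures.append("R: no decision this number is meant to inform")
    if not metric.get("reporting_window"):
        failures.append("T: no reporting window with a start and an end")
    return failures
```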

Q.12

How does Sopact Sense help build SMART metrics?

Sopact Sense captures survey responses and program records under one stakeholder ID, so any SMART metric can be defended back to the source row. The Specific test gets a named field. The Measurable test gets a tracked instrument and unit. The Achievable test gets a baseline pulled from prior-cohort data. The Relevant test gets a tie to the program theory captured at design time. The Time-bound test gets a collection window the platform enforces.

Q.13

Can I use Google Forms or SurveyMonkey to build SMART metrics?

Forms tools collect responses. They do not enforce a stakeholder ID across forms, do not link to program records, and do not surface a baseline at the moment of metric design. Teams using Google Forms or SurveyMonkey to build SMART metrics typically end up matching exports by hand in a spreadsheet, which is where the M and the A in SMART quietly fail. The collection part is fine. The defensibility part needs a system that holds the parts together.

Related guides

Methodology guides that pair with SMART metrics

The R test in SMART is "does this metric tie to the program theory." These sibling guides cover the theory and the data structure that make the Relevant test answerable.

Working session

Bring a metric. Leave with a SMART version you can defend.

A sixty-minute working session. Bring one metric your program reports on today. We walk it through the five tests on screen and show what it would look like in Sopact Sense, with the named field, the tracked instrument, and the linked baseline in place. No procurement decision is involved. The output is a marked-up metric you can rewrite into your next quarterly report.

Format
60 minutes on a video call. Screen sharing both ways.
What to bring
One metric you currently report on. The vaguer it is, the better the example.
What you leave with
A rewritten SMART version, a baseline plan, and a sketch of where each test lives in your data.