Our Data Pipeline allows you to perform transformations on your query results in a series of steps. These steps include a variety of operations such as column sorting, pivoting data, and adding calculated columns. The flexibility of adding any steps in any order allows you to get your data exactly how you want it.
Adding a Step
Click the button in the pipeline to add a new transformation step to your query results. You can apply as many steps as you'd like, in any order. You can also use each step multiple times. The available steps and examples are outlined below.
Data Pipeline steps are applied in order from first to last, so in some cases it is possible for a later step to override an earlier one.
To view your data at any point in the pipeline, click a Pipeline step to open it. Then, click Show Input/Output Data. It can be incredibly useful to compare how your data looks before and after a transformation step is applied.
View the executed query by clicking the Executed query link next to the Run Query button.
The Executed Query will show you the query exactly as it is sent to your database. This is especially helpful for charts using dashboard variables, as it includes the variable values that are inserted into the query. You can also see any timezone offset that has been applied.
Data Pipeline Steps
Add a calculated column to your query results. Use one of our built-in functions, or choose Custom Formula in the Formula Type dropdown to create your own. See Formulas to view some examples of functions you can use.
The Bucket Data pipeline step divides the values from one column into a series of ranges, and then counts how many values fall within each range. This data format can be used to create a histogram chart.
Histograms generally use COUNT as the Aggregation, but you may also choose MIN, MAX, SUM, or AVERAGE.
You have three options for how the bucket ranges are defined: Standard Deviation, Bucket Size (amount in each range), or Number of Buckets.
Add a column to your query results (or edit an existing column) where values are set based on certain "if... then" conditions.
Dashboard variables can be used in any part of a Case Statement pipeline step.
To pass through column values in THEN or ELSE, select Column from the dropdown and choose your column.
Combine columns either by applying a mathematical operation or by concatenation. Check the Hide Combined Columns checkbox to hide your original columns.
Use this step to edit an existing column. All column types (text, date, number, etc.) allow entering a custom formula, and numerical columns have additional formula types including simple math operations and rounding.
The Extrapolate Data step provides predictive data for a column in your query results. Choose whether to add data to the existing column or create a new one, the number of rows returned, and the type of extrapolation used (Cubic, Quadratic, or Linear).
Filter rows on certain conditions. Choose Include or Exclude, then choose:
- Matching all conditions
Uses AND logic. Row will be included or excluded only if it matches every condition.
- Matching any conditions
Uses OR logic. Row will be included or excluded if it matches one or more conditions.
Regular Expression matching
Any text column supports filtering by Java-style regular expressions in the Filter Rows
Pipeline step. Matching is case-sensitive and unicode-aware by default. For case-insensitive matching, add (?i) to the beginning of your regular expression.
Group works the same as GROUP BY in SQL. It allows you to group your query results based on a certain column. You'll need to choose which column you want your chart to be grouped by, and which aggregation you would like for the other columns. Aggregation types include SUM, AVG, CONCATENATE (comma-separated list), COUNT, COUNT DISTINCT, MAX, and MIN.
Hide one or more columns in your query results. If you need to use hidden columns in a Formula, hide the columns as the last step.
Limit the number of rows included in your query results. Can be useful when applied after a sort—for example, sort sales descending and limit 10 to include the top 10 sales. Offset allows you to select a specific range of rows. For example, to include only rows 10-30, set the limit to 20 and the offset to 10.
If your chart has one measure grouped by two dimensions, such as count of activity grouped by month and by activity type, you'll want to pivot the data so you can use it in a chart.
Pivot tables require exactly 3 columns. Your data is pivoted on the second column-e.g., the second column becomes the column headers.
Select Ascending, Descending, or None in the Column sort direction dropdown to customize the column order in your pivot table. Aggregate Function allows you to choose the aggregation for duplicate group values. Available aggregations are: SUM, AVG, MIN, MAX, GROUP_CONCAT, Auto. Auto will choose either SUM or GROUP_CONCAT, depending on the data type.
Rename one or more columns. Useful for display purposes, or shortening column names for ease of use in formulas.
Easily click and drag your column names to reorder how they will appear in your chart. Note: it's generally best to reorder columns in one of the last steps, as any editing that involves column changes will affect your reordering settings.
Apply a sort to one or more columns.
Unpivot data is useful when you have multiple single value columns that you want to use in a visualization, such as a pie, line, or bar chart.
If each layer returns a single value, choose Cross Join as your merge type.
Zero Fill Data
Zero fill fills in missing values in your data with zeros. It's important when charting a time series with missing dates, or when performing calculations on columns that contain NULL values.
Zero-fill will fill in all date and numeric columns in your query results.
The Add Column and Edit Column Pipeline steps allow you to use formulas to transform your data. You can either use one of our built-in formulas as-is, or write your own formula using the Custom formula option.
If you aren't seeing a formula type you expect to see for an existing column, check the column's data type. Some formulas are only available for numeric columns.
Tip: When dividing integers, be sure to multiply the top value by 1.0 to convert your results to a decimal. For example:
(1.0 * "Count of Clicks")/"Ad Cost"
Add a numerical value to a column. Enter a column name to add a column value instead.
Creates a new column with an aggregation value of another column. Choose from SUM, MIN, MAX, or COUNT.
Useful for applying calculations to a column that involve an aggregated value. For example, use SUM aggregation, then Column Ratio in the Add Column pipeline step to determine a column value's percentage of the total column.
Easily divide one column by another to get a ratio. Integers are auto-converted to decimals to ensure decimal accuracy.
|Moving average||moving_avg("Column name", # trailing rows)|
|Running total||running_total("Column name")|
|Percent change||percent_change("Column name")|
|Format||format("Column name", format string)|
|Lag||lag("Column name", offset)|
|Round||round("Column name", # decimals)|
|Coalesce||coalesce("Column name", replacement char)|
|Cast||cast("Column name" as datatype)|
acos, asin, atan, atn2, atan2, acosh, asinh, atanh, difference, degrees, radians, cos, sin, tan, cot, cosh, sinh, tanh, coth, exp, log, log10, power, sign, sqrt, square, ceil, floor, pi
replicate, charindex, leftstr, rightstr, ltrim, rtrim, trim, replace, reverse, proper, padl, padr, padc, strfilter
stdev, variance, mode, median, lower_quartile, upper_quartile
In the formulas below, units can be one of the following: day, month, year, hour, minute, second, quarter (datepart() only), dayofyear (datepart() only). For more details on these date functions, see our Knowledge Base article.
Divide a column by a value, or by another column.
Used to specify decimal precision, useful for force-formatting currency. Format returns a string, not an integer. Therefore, it is best used in table charts or Single Value charts.
Access data from a previous row in the same column. Row offset defines how many rows back you want to use.
Generates a series of averages of your data to create a smooth trend line. Trailing rows is the number of previous rows to include in each row's average.
Multiply a column by a value, or by another column.
Determines percent change between current and previous row.
Ratio of total
Displays each row's value as a ratio of the column sum.
Round is used to round to a specific number of decimal places. Note: you cannot round a number to a higher decimal precision than it already has. For example, you cannot round the integer 10 to have two decimal places.
Sums up all previous results of a column.
Subtract a value from a column. You may also subtract a column from another column.
Total column sum
Creates a new column where each value is a sum of its row.