Node Variables

Overview #

Node Variables are attached to datasets, which are represented by nodes in the Data Pipeline. Node Variables must be inserted into the queries and scripts configured for those nodes in the pipeline. Once inserted, end users will be able to alter the underlying queries and scripts by passing in different values for the Node Variable.

The values that get passed into these underlying queries need to be constrained so they don’t break things. This is handled through the concept of Variable Constraints.

Adding Node Variables to a Query #

The primary reason for using Node Variables is interactivity. Static queries produce static data, which in turn create static visualizations. End users often want to alter the way data is filtered, adjust time ranges, or compare different metrics. Essentially, end users want to interact with data. Variables, along with Variable Constraints, enable that interactivity.

Adding interactivity starts with the queries and scripts used inside of feed and transform nodes in the pipeline. In order to be interactive, these queries and scripts need to use variables. We refer to these types of variables as “Node Variables”, because elements are visually depicted in the data pipeline as a hierarchy of nodes.

The example below is representative of all feeds and transforms that take SQL or JavaScript input. Variables are introduced into queries by using the Insert > Node Variable button. To better describe what is happening, assume the basic SQL statement shown below:

The syntax above will return all records that have a value of “Arches” for the “Park” attribute (column). This is great if you happen to be interested in a park called “Arches“, but that is all this query will ever return.  What if you are interested in a different park?  What if you want the user to be able to choose whatever park they want?

When an administrator desires interactivity, they will need to click the Insert > Node Variable button.  The example below shows what the resulting query would look like after replacing the static value “Arches” with a node variable named “ParkName“.

Choosing a Variable #

After clicking the Insert Variable button, the Choose a Variable dialog will appear. This dialog lets you reuse variables that have already been created for the current connection.

  • If you do not see anything, then click the  +  icon to add a new variable for this connection.
  • If there is already a relevant variable defined, then you can simply select it from the list and reuse it without any additional configuration.

Node Variable Scope

Node Variables are bound to a specific node in the system. Node Variables are inherited by all descendants, which appear as downstream nodes in that pipeline. So, if you create a variable off of one feed, and then switch to a peer feed, you will not see any existing variables on that new feed. You can reuse the same Variable Constraints across both feeds, just not the same Node Variable. For more information, see “Adding a New Variable” below, or the section on Variable Constraints.

Adding a New Variable #

Adding a new variable is simple.  It consists of adding a new “Variable Name”, and then mapping that name to an existing “Variable Constraint”.  A bulk of the configuration for Node Variables is contained in the Variable Constraint, which means that most of the configuration is global.  Only the name is bound to an individual node.

Property
Description
Name A symbolic name used to represent this variable. This is the name that will show up in the Choose a Variable dialog.
Unused Mark the variable as ‘unused’ to bypass the node variable integrity check for unused node variables. Variables should only be marked as unused in a very rare condition. Example: using the variable only inside of the HTML Template Visualization.
Variable Constraint Limits what values can be used for a variable. An end user will only be able to select values specified by the variable Constraint. See the Variable Constraints section for more information.
Default Value What will be used when initializing a query. The default value is constrained by the Variable Constraint you have selected. In most cases, you are choosing an element from an array of valid values.

Protecting Against SQL Injection #

edgeSuite v3.2 introduced a new “safeNodeVar” syntax to use inside of SQL queries. This new syntax protects against malicious SQL injection.

The old “nodeVar” syntax used in edgeSuite prior to v3.2 used direct STRING replacement, and was vulnerable to malicious SQL injection. That “nodeVar” syntax has been deprecated in edgeSuite v3.2, but edgeSuite does not alter any existing SQL queries when upgrading archives as it could break Visualizations. Administrators are strongly encouraged to adopt the new “safeNodeVar” syntax moving forward.

If safeSubstitution is turned off, then nodeVars will behave the same as they did in previous versions of the product. The old concept of a “nodeVar” offers a lot of flexibility, however unbounded STRING values passed in for existing nodeVars could leave a system vulnerable to a SQL injection attack.

SafeSubstitution is set to OFF when importing an archive that was created in an edgeSuite version prior to v3.2, and which contains a nodeVar in any SQL query.

Safe Substitution #

  • When safeSubstitution is OFF, nodeVars will behave the same as in previous versions of edgeSuite.
  • When safeSubstitution is ON, node variables used in SQL queries will be updated to the “safeNodeVar” syntax, which helps protect against malicious SQL injection.

The following list provides various scenarios regarding upgrade logic for existing archives:

  1. Fresh edgeSuite v3.2+ installation:
    Safe substitution is enabled
  2. An archive created prior to edgeSuite v3.2, but no nodeVar references in existing SQL queries:
    Safe substitution is enabled
  3. An archive created prior to edgeSuite v3.2, and with at least one nodeVar reference in an existing SQL query:
    Safe substitution is disabled 
  4. An archive created in edgeSuite v3.2+:
    Safe substitution is configured according to what is in config.sql in the archive.

    1. If the archive originally came from upgraded content with “nodeVars”, then Safe Substitution is disabled.
    2. If the archive was created from a fresh edgeSuite v3.2+ installation, then Safe Substitution is enabled.
The deprecated “nodeVar” syntax is still available once safe subscription is enabled, but only by manually entering the correct syntax into a query. The “Insert NodeVar” button will assume the “safeNodeVar” syntax.

Manually Enabling Safe Substitution #

For archives using the deprecated “nodeVar” syntax, which is anything created prior to the edgeSuite v3.2 release, an administrator can run a command to update all queries so that they are protected against malicious SQL Injection. Running the command below will enable the new “safeNodeVar” syntax on all SQL queries.

bin/edge.sh config -s global -k pipeline.safeSubstitution -v true

The following table provides a summary of how edgeSuite v3.2+ protects against SQL Injection:

Ref
Constraint
Dynamic
safeSubs
Allowed
Notes
safeNodeVar any any any yes safeNodeVar always allowed – this will use a placeholder – this SQL may need to be expressed slightly differently (typically through use of CONCAT)
secVar n/a n/a any yes secVar always allowed – secVars are currently always admin-defined strings, so the string is just substituted into the SQL
nodeVar unbounded n/a yes no edgeSuite has no way to determine whether the value is safe
nodeVar unbounded n/a no yes same behavior as in edgeSuite versions prior to v3.2 – no checks
nodeVar bounded static yes yes edgeSuite will check the value against the static list
nodeVar bounded dynamic yes yes edgeSuite will check the value against the dynamic list – it is up to the administrator to determine whether or not the set of derived values is safe
nodeVar bounded static no yes same behavior as in edgeSuite versions prior to v3.2 – no checks
nodeVar bounded dynamic no yes same behavior as in edgeSuite versions prior to v3.2 – no checks