Which visualization should I choose? Main concepts

Felipe A. Moreno
5 min read · Mar 17, 2021

By the end of this article, you will know which visualization to choose.

Chart chooser. source

Hello, fellows!
This time, I am going to explain the main concepts of visualization in detail.

Types of information and datasets

Data can be described by its statistical level of measurement: nominal (categories), ordinal (rank order, survey scales), interval (percent, temperature), or ratio (weight, height).
According to Tamara Munzner (2014), there are four common dataset types: tables, networks, fields, and geometry. Note that a table is a dataset type here, not a visualization chart type, so a raw table should not be relied on as an expressive and effective data visualization.

Source: Tamara Munzner (2014)

Types of attributes

Attributes can be quantitative, sequential, or categorical.

Basic concepts:

In visualization, marks and channels are the building blocks of visual encoding.

Marks: basic geometric elements that depict items and links.
Mark Types:

  • Item marks: classified according to their spatial dimensions: 0D -> points; 1D -> lines; 2D -> areas, etc.
  • Link marks: show relationships between items
    - Connection marks: show pairwise relationships
    - Containment marks: show hierarchical relationships
Marks. Source: Tama et al

Channels: control the marks’ appearance and encode properties of a mark.
Channel Types:

  • Identity channels (categorical data): what something is or where it is (for example, shape: circle, triangle, cross, etc.)
  • Magnitude channels (ordered data): how much of something there is (length, luminance, etc.)
Marks and Channels. Source: Tama et al

Expressiveness

The visual encoding should express all of the information in the dataset, and only that information.
Expressiveness is about “what can be expressed”: the type of information that can (or cannot) be conveyed by a channel. In other words, it asks whether a channel can express quantities, sequences, or categories without, at the same time, suggesting information that was not intended.
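
One way to read the expressiveness principle is as a type check between an attribute and a channel. The sketch below is only an illustration, not a standard API: the `CHANNEL_KIND` and `REQUIRED_KIND` tables and the `is_expressive` helper are hypothetical names, and the channel classification simply follows the identity/magnitude split described earlier.

```python
# Hypothetical classification of a few channels, following the
# identity/magnitude split described in the text.
CHANNEL_KIND = {
    "shape": "identity",
    "hue": "identity",
    "position": "magnitude",
    "length": "magnitude",
    "luminance": "magnitude",
}

# Assumption: ordered attributes (quantitative or sequential) call for a
# magnitude channel, while categorical attributes call for an identity channel.
REQUIRED_KIND = {
    "categorical": "identity",
    "sequential": "magnitude",
    "quantitative": "magnitude",
}


def is_expressive(attribute_type: str, channel: str) -> bool:
    """Return True when the channel kind matches the attribute type."""
    return CHANNEL_KIND[channel] == REQUIRED_KIND[attribute_type]


for attribute_type, channel in [
    ("categorical", "hue"),      # categories on hue
    ("categorical", "length"),   # length would suggest an order that is not there
    ("quantitative", "shape"),   # shape cannot show "how much"
]:
    print(attribute_type, channel, is_expressive(attribute_type, channel))
```

The second and third pairs fail the check for opposite reasons: one channel says too much (an order that does not exist), the other says too little (no magnitude).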

Effectiveness

The importance of the attribute should match the salience of the channel.
Effectiveness is about “how well it can be expressed”: a channel is effective if it represents information accurately. In other words, it asks how accurately a channel can express quantitative information.

Effectiveness. Source: Tama et al

Effectiveness can be assessed for a single channel, through accuracy (how well magnitudes are estimated) and discriminability (the number of values one can distinguish), or across multiple channels, through salience (attracting attention), separability (interference between channels when tuning attention), and grouping (pattern formation):

  • Accuracy: How close is human perceptual judgment to some objective measurement of the stimulus?
    Aligned position > unaligned position > length > angle > area > volume > curvature > luminance > hue.
  • Salience: How can attention be directed?
    This can be done by playing with channels such as color, position, and shape.
  • Separability: How do channels interact with each other?
    Not all channels are independent. It should be easy to tune attention to one channel regardless of the others.
  • Discriminability: How many distinguishable levels (bins) does the channel have?
    Viewers can usually tell apart at most 5–10 values in a single chart.
    When there are many values, it is possible to group (hierarchy), filter (select elements), and/or facet (split them into separate views).
  • Grouping: How do visual elements form patterns?
    Select channels that allow visual grouping or visual clustering.
    This can be done using connection (links), enclosure (contours), proximity (position distance), similarity (shape, color), continuity (lines, groups), and closure (completing missing information).
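
The idea that the importance of an attribute should match the accuracy of its channel can be sketched as a greedy assignment: order the attributes by importance and give each one the most accurate channel still free. The ranking comes from the accuracy list above; the function name `assign_channels` and the attribute names are made up for illustration, and the sketch deliberately ignores expressiveness and separability.

```python
# Accuracy ranking from the list above, most accurate first.
CHANNELS_BY_ACCURACY = [
    "aligned position", "unaligned position", "length", "angle",
    "area", "volume", "curvature", "luminance", "hue",
]


def assign_channels(attributes_by_importance):
    """Pair each attribute with the most accurate channel still free.

    `attributes_by_importance` must be ordered from most to least important.
    """
    if len(attributes_by_importance) > len(CHANNELS_BY_ACCURACY):
        raise ValueError("more attributes than available channels")
    return dict(zip(attributes_by_importance, CHANNELS_BY_ACCURACY))


# Hypothetical dataset attributes, most important first.
encoding = assign_channels(["revenue", "growth", "headcount"])
for attribute, channel in encoding.items():
    print(f"{attribute} -> {channel}")
```

In practice the result still needs a sanity check by eye, since two channels that rank well on accuracy may interfere with each other.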

Colors

In this section, we explain color and how to represent it. We also cover some concepts and main ideas about how to make the most of color in visualization.
The main purposes of choosing good colors are to quantify, to label, and to emphasize/highlight.

The main color models are listed below:

  • RGB (red-green-blue): Additive mixing, not intuitive and not perceptually linear; the primaries mix to create white.
  • CMYK (cyan-magenta-yellow-blacK): Subtractive mixing, not intuitive and not perceptually linear; the primaries mix to create black.
  • CIE L*a*b* (lightness-a-b): a goes from green to red/magenta and b goes from blue to yellow. It is a non-linear transformation but is approximately perceptually uniform.
  • CIE LCh: Also called HCL (hue-chroma-luminance); it transforms the Cartesian CIE L*a*b* coordinates into cylindrical coordinates.
  • HSL (hue-saturation-lightness): Most used and useful way to describe colors.
  • HSV (hue-saturation-value): Also called HSB (B for brightness).

Here, hue is the color type (name/angle); saturation is the radius (from white to the pure color); brightness/value is the height (from black to the pure color); lightness is the height (from black to white, passing through the pure color); chroma is the radius (relative saturation); and luminance is the height (the same as lightness in CIE L*a*b*).
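
To make the difference between value and lightness concrete, Python's standard `colorsys` module converts RGB into the HSV and HSL models. This is just an illustration with one arbitrary color; note that `colorsys` names the second model HLS, returns the components in the order hue, lightness, saturation, and keeps every component in the range 0 to 1.

```python
import colorsys

# Pure red in RGB, every component in [0, 1].
r, g, b = 1.0, 0.0, 0.0

hue_v, sat_v, value = colorsys.rgb_to_hsv(r, g, b)
print(f"HSV: hue={hue_v * 360:.0f} deg, saturation={sat_v:.2f}, value={value:.2f}")

# colorsys returns HLS order: hue, lightness, saturation.
hue_l, lightness, sat_l = colorsys.rgb_to_hls(r, g, b)
print(f"HSL: hue={hue_l * 360:.0f} deg, saturation={sat_l:.2f}, lightness={lightness:.2f}")
```

The pure color sits at the top of the HSV scale (value 1.0) but in the middle of the HSL scale (lightness 0.5), because lightness keeps going up to white.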

Depending on the point of view, we can use the RGB cube or the CMYK cube as the color space.
HSL vs HSV/HSB. Source: Website.
CIE L*a*b* to HCL, taking the angle as hue and the radius as chroma.
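
The Cartesian-to-cylindrical step behind CIE LCh is the usual polar conversion: chroma is the radius in the a-b plane and hue is the angle. A minimal sketch, with function names chosen here for illustration and an arbitrary example color:

```python
import math


def lab_to_lch(lightness, a, b):
    """Convert CIE L*a*b* to LCh: chroma is the radius, hue the angle in degrees."""
    chroma = math.hypot(a, b)
    hue_deg = math.degrees(math.atan2(b, a)) % 360.0
    return lightness, chroma, hue_deg


def lch_to_lab(lightness, chroma, hue_deg):
    """Inverse conversion, back to the Cartesian a and b axes."""
    hue_rad = math.radians(hue_deg)
    return lightness, chroma * math.cos(hue_rad), chroma * math.sin(hue_rad)


# Arbitrary example color with a = 30 and b = 40.
L, C, h = lab_to_lch(50.0, 30.0, 40.0)
print(f"L={L:.1f}, C={C:.1f}, h={h:.1f} deg")
```

Lightness passes through unchanged; only the two color axes are re-expressed as a radius and an angle.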

From this, we can compare how intuitive and perceptually uniform they are. The figure below shows a table summarizing these two aspects (CMYK behaves similarly to RGB).

Source: Specialized program: Information visualization

A single-hue scale is used to show quantity, as in heatmaps and labels.
A multiple-hue scale is used for aesthetics, higher discriminability, and segmentation or labeling.

A categorical color space is used for uniformity (nothing stands out) and discriminability (typically 5–10 colors). A good categorical color space in HCL: keep chroma and luminance constant, then sample across hue.
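
The recipe of constant chroma and luminance with evenly sampled hue can be sketched with the standard library alone. The conversion below assumes the D65 white point and the usual sRGB matrix, and it simply clips colors that fall outside the sRGB gamut; the function names and the default luminance and chroma values are arbitrary choices for illustration, not recommendations from the article.

```python
import math


def lch_to_hex(lightness, chroma, hue_deg):
    """Convert CIE LCh (D65 white point) to an sRGB hex string.

    Out-of-gamut values are simply clipped to [0, 1].
    """
    # LCh -> Lab (cylindrical to Cartesian).
    a = chroma * math.cos(math.radians(hue_deg))
    b = chroma * math.sin(math.radians(hue_deg))

    # Lab -> XYZ.
    fy = (lightness + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4.0 / 29.0)

    x = 0.95047 * f_inv(fx)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fz)

    # XYZ -> linear sRGB.
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def gamma(c):
        c = min(max(c, 0.0), 1.0)  # clip out-of-gamut values
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return "#" + "".join(f"{round(gamma(c) * 255):02x}" for c in linear)


def categorical_palette(n, luminance=65.0, chroma=50.0):
    """Sample n hues evenly while keeping luminance and chroma constant."""
    return [lch_to_hex(luminance, chroma, 360.0 * i / n) for i in range(n)]


palette = categorical_palette(6)
print(palette)
```

Because only the hue changes, no color in the palette should look noticeably brighter or more vivid than the others, which is the "nothing stands out" property.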

A diverging color space is used to distinguish values above and below a threshold. A good diverging color space in HCL: same as categorical, plus the same luminance “ramp” on both sides.
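
The structure of such a diverging scale can be sketched directly in HCL coordinates, leaving the conversion to RGB to a color library. The two hues, the chroma, and the luminance range below are arbitrary assumptions, and `diverging_palette` is a hypothetical helper name; the only point is that both arms share one luminance ramp, mirrored around the threshold.

```python
def diverging_palette(steps, hue_low=260.0, hue_high=30.0,
                      chroma=50.0, l_dark=40.0, l_light=90.0):
    """Build a diverging scale as (luminance, chroma, hue) triples.

    Both arms use the same luminance ramp: dark at the extremes and
    light next to the threshold in the middle.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    ramp = [l_dark + (l_light - l_dark) * i / (steps - 1) for i in range(steps)]
    low_arm = [(lum, chroma, hue_low) for lum in ramp]              # dark -> light
    high_arm = [(lum, chroma, hue_high) for lum in reversed(ramp)]  # light -> dark
    return low_arm + high_arm


scale = diverging_palette(3)
for luminance, chroma, hue in scale:
    print(f"L={luminance:.0f}  C={chroma:.0f}  h={hue:.0f}")
```

Values far below the threshold get the darkest color of the first hue, values far above it get the darkest color of the second hue, and values near the threshold are the lightest on both sides.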

Color convention: red is bad, green is good, and gray is unspecified.

Size observations: small areas -> high saturation, large areas -> low saturation.

Contrast: the difference in color that makes objects distinguishable (luminance is the most important factor).
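
The article does not say how to measure luminance contrast. One widely used definition, taken here as an assumption rather than as the author's method, is the WCAG 2.x contrast ratio, which compares the relative luminance of two sRGB colors. The function names are chosen for illustration.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as 0-255 integers (WCAG 2.x)."""
    def linearize(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


print(f"black on white: {contrast_ratio((0, 0, 0), (255, 255, 255)):.1f}")
print(f"red on green:   {contrast_ratio((255, 0, 0), (0, 255, 0)):.1f}")
```

Black on white reaches the maximum ratio of 21, while pure red on pure green scores far lower even though the hues are very different, which is why luminance matters more than hue for legibility.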

Luminance for contrast, Source: Specialized program: Information visualization

The main issue is color blindness: a) deuteranopia: cannot distinguish red from green; b) tritanopia: cannot distinguish blue from yellow.

Main Tools: ColorBrewer, ColorPicker.
Resources: Silhouette

References

https://datavizcatalogue.com/blog/chart-selection-guide/
https://policyviz.com/2016/11/30/style-guides/
https://extremepresentation.typepad.com/blog/2006/09/choosing_a_good.html
https://policyviz.com/2014/02/05/a-visualization-mapping-form-and-function/
http://excelcharts.com/classification-chart-types/
