In our last article, Data Matters, we spoke about “The 4 V’s of Big Data” and how people at every level of your organization, from data scientists to domain experts, need to find ways to be more data-driven so they can make better, more informed decisions. We listed three key ways that you can extract value from your data: Data Discovery, Data Analytics and Machine Learning. In this article, we will discuss the first two, Data Discovery and Data Analytics, both through Smart Data Visualization.
Smart Data Visualization
You’re probably familiar with data visualization. Data visualization architectures that produce interactive reports to explore data from multiple sources are nothing new. Data visualization has historically been a self-driven process where you start by picking the analysis you want to perform and then pick the best chart or other visualization to match the resulting data structure(s). Hopefully you learn something for your efforts, and then you come up with a new idea and repeat. The term “slice and dice” is often used to refer to this process — systematically breaking down the information available to examine it from different viewpoints so that you can understand it better and visualize it in a variety of different and useful ways.
This can be a fairly slow process, and what happens when you’re not sure of the best ways to visualize your data? Smart Data Visualization was introduced to solve these “too much data, too many visuals” problems. Smart visualization architectures pre-analyze your data to determine which data is most informative and present a wide array of relevant visualizations so you can readily see how your data may be best visualized. Think of it as a virtual assistant with detailed knowledge of data types, statistics and graphical design.
“Data discovery has become a mainstream architecture…”
The smart data visualization process is highly automated and starts from your data. Previously, if you wanted to visualize your data, you might need to go through the hit-or-miss process of selecting a specific analysis and then trying various visuals to present the results. With Smart Data Visualization, you simply start with your dataset and choose some columns of interest. You are immediately presented with the set of analyses that best apply, based on the data types and statistical profiles of the selected data. Modern architectures and in-memory / cloud processing have made this automated and highly responsive, leading to new levels of data discovery by users with and without data science backgrounds.
The smart visualization experience can also guide you to selecting the best columns by analyzing which columns in the dataset make the best measures to display and which make the best dimensions of analysis. Some general rules apply:
|Aspects of Measurability|Aspects of Dimensionality|
|---|---|
|Does it have a quantity? Examples: Prices, Sizes, Weights|Is it a category or enumeration? Examples: Brand, Color, Type|
|Does it lend itself to aggregation? Examples: Counts, Sums, Averages|Does it lend itself to grouping into ranges? Examples: Time, Age, Region|
|Is it a dependent variable? Examples: Profit, Temperature, Efficiency|Is it an independent variable? Examples: Costs, Pressure, Speed|
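To make these rules concrete, here is a minimal Python sketch of how a tool might sort columns into measures and dimensions. It is not nD math and not how nD does it. The function name `classify_column` and the `max_unique_ratio` threshold are assumptions chosen for illustration: non-numeric columns are treated as dimensions, and numeric columns count as dimensions only when their values repeat often.

```python
def classify_column(values, max_unique_ratio=0.5):
    """Label a column 'measure' or 'dimension' (illustrative heuristic only)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "dimension"

    # bool is a subclass of int in Python, so exclude it explicitly.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if not is_numeric:
        # Text, dates and flags are categories or groupable ranges.
        return "dimension"

    # Numeric values that repeat often (beds, baths) behave like categories;
    # values that are mostly unique per row (price, size) behave like quantities.
    unique_ratio = len(set(non_null)) / len(non_null)
    return "dimension" if unique_ratio <= max_unique_ratio else "measure"


# Hypothetical sample columns, invented for this example.
prices = [250000, 310000, 410000, 520000, 199000, 305000]
beds = [2, 3, 3, 4, 3, 2]
brands = ["Acme", "Acme", "Globex"]

roles = {
    "price": classify_column(prices),
    "beds": classify_column(beds),
    "brand": classify_column(brands),
}
```

A real profiler would look at far more than a uniqueness ratio, such as value distributions, units and column names, but the split between "mostly unique quantity" and "small set of repeated values" is the core of the table above.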
Use Case: Exploring Real Estate Price Data
Here is a brief example of these concepts. Let’s say we have some real estate data that includes price, square footage, number of beds, number of bathrooms and other attributes.
In this example, Price, Square Footage, and the calculated Price per Square Foot are quantities that are relatively unique to each sample. The zip code, number of baths and number of beds are limited to a small set of distinct values, so they make good candidates for grouping the prices and other measures. Similarly, the last sale date is a time, so it lends itself to discrete groups, like year of sale.
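The grouping described here can be sketched in a few lines of Python. The rows, zip codes and the helper `average_by` below are all invented for illustration and are not taken from the dataset in this article. The sketch averages one measure (price, then price per square foot) within each value of a dimension (zip code, then year of sale).

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical listings, invented purely for this example.
listings = [
    {"price": 300000, "sqft": 1500, "zip": "90001", "last_sale": date(2015, 6, 1)},
    {"price": 450000, "sqft": 2000, "zip": "90001", "last_sale": date(2016, 3, 15)},
    {"price": 200000, "sqft": 1000, "zip": "90002", "last_sale": date(2015, 9, 30)},
    {"price": 260000, "sqft": 1300, "zip": "90002", "last_sale": date(2016, 11, 5)},
]


def average_by(rows, dimension, measure):
    """Average a measure within each distinct value of a dimension."""
    groups = defaultdict(list)
    for row in rows:
        groups[dimension(row)].append(measure(row))
    return {key: mean(values) for key, values in sorted(groups.items())}


# Zip code is a small set of distinct values, so it groups naturally.
by_zip = average_by(listings, lambda r: r["zip"], lambda r: r["price"])

# A sale date is turned into a discrete group by taking its year.
by_year = average_by(
    listings,
    lambda r: r["last_sale"].year,
    lambda r: r["price"] / r["sqft"],  # calculated price per square foot
)
```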
Let’s say you’re interested in finding out more about the prices in the data. You could begin by looking at visualizations of the prices.
You could then see how price interacts with other measures, such as square footage.
You could even see how the price relates to the zip code. Since zip codes have geolocation information, in addition to standard charts, smart visualization could suggest viewing this as a map.
The important thing here is that Smart Visualization puts these options at your fingertips. All you need to do is select the columns, and the appropriate visualizations are displayed immediately. And while we’re only showing two sample visualizations here, there are typically a large number of options available depending on the characteristics of the data, often sorted based on how well each analysis and visualization matches your data.
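As a rough illustration of how selected columns could map to a ranked list of visualizations, here is a small rule-based sketch in Python. The rule table, the role names ("measure", "dimension", "geo", "time") and the scores are assumptions made up for this example; they do not describe how nD or any other product ranks its suggestions.

```python
# Each rule: (roles of the selected columns, suggested chart, illustrative score).
RULES = [
    (("measure",), "histogram", 0.9),
    (("measure",), "box plot", 0.8),
    (("measure", "measure"), "scatter plot", 0.9),
    (("dimension", "measure"), "bar chart", 0.9),
    (("dimension", "measure"), "grouped box plot", 0.7),
    (("geo", "measure"), "map", 0.9),
    (("geo", "measure"), "bar chart", 0.6),
    (("measure", "time"), "line chart", 0.9),
]


def suggest_charts(roles):
    """Return chart names for the selected column roles, best match first."""
    key = tuple(sorted(roles))  # order of selection does not matter
    matches = [(score, chart) for required, chart, score in RULES if required == key]
    matches.sort(key=lambda match: match[0], reverse=True)
    return [chart for _, chart in matches]


price_only = suggest_charts(["measure"])               # e.g. Price
price_vs_sqft = suggest_charts(["measure", "measure"])  # e.g. Price and Square Footage
price_by_zip = suggest_charts(["measure", "geo"])       # e.g. Price and Zip Code
```

In this sketch, selecting Price alone yields a histogram and a box plot, adding Square Footage yields a scatter plot, and pairing Price with a geographic column such as zip code puts a map at the top of the list.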
Extending Smart Data Visualization with Advanced Analytics
The great thing about smart data visualizations is they naturally lead into more advanced data analysis. For example, lots of different ways to analyze and visualize stock data have been developed over the years. Smart data visualization platforms can automatically apply these techniques and include them in the available visualizations. Similarly, techniques developed for the stock market or statistical analysis or even signal processing and system theory can be applied to data from other fields to visualize them in new and insightful ways.
The best part is that you don’t need to know the statistics, technical analysis or signal processing theory to use them! Smart visualization platforms can automatically include these advanced analyses with the more straightforward analyses so you can quickly see your data from a variety of different perspectives. Even when you do understand all of the theory and can do the math, smart visualization platforms can significantly accelerate your work through automation and allow you to extend the types of analysis and visualizations available to you and your less technical colleagues. This is where you really start to get into data mining and extracting the hidden value in your data.
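One familiar technique from stock charting that transfers well to other fields is the simple moving average, which smooths a noisy series so the underlying trend is easier to see. The Python sketch below is a generic illustration of that idea, not a description of nD’s built-in analyses; the function name and the sample readings are hypothetical.

```python
def simple_moving_average(values, window):
    """Average each run of `window` consecutive values in the series."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running_total -= values[i - window]
        if i >= window - 1:
            averages.append(running_total / window)
    return averages


# Hypothetical sensor readings; the same function works for prices,
# temperatures or any other ordered measure.
readings = [1, 2, 3, 4, 5]
smoothed = simple_moving_average(readings, window=3)
```

The output has `window - 1` fewer points than the input, because the first average is only available once a full window of values has been seen.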
Some examples of adding advanced analytics to smart data visualization include:
Meet nD, again.
As we mentioned in Data Matters, nD was created as a solution for big data problems and much of the nD experience is centered around smart data visualizations extended with advanced analytics. As soon as data is brought into nD, it is automatically synthesized and analyzed so that you can immediately launch into the data discovery process. The nD interface includes automation wizards, one of which is dedicated to smart data visualization, to explore your data and identify informative insights.
Get Complete Control with the nD Math Language
Wizards are just the beginning. They provide a great starting point for exploring the power of nD and for many users it may be all the power they need. But what can you do when you or your team need more control over your data and visualizations?
At the heart of nD is a powerful math language that runs on a cluster of computers in the cloud. Behind the scenes, the Visualization Wizard (like all nD wizards) simply writes the nD math that tells the system what data to access, what operations to apply and how to configure the visualizations to be displayed. The nD math language is data-driven, with extensive data, math and visual operations. Anything you can do in the wizards can be done through direct entry of nD math. And once you are comfortable with it, the nD math language empowers you, your ideas and your data with limitless possibilities.
To help bridge the gap between what you can accomplish with the wizards and learning the nD math language, each of the nD wizards allows you to view the nD math it generates with the click of a button. It does this for two reasons. First, it quickly shows you how to accomplish things directly in nD. Second, it allows you to directly modify the output of the wizards to give you complete control over how you interact with your data, regardless of your level of expertise.
Let’s say you like the box plot of the Price that the nD Visualization Wizard created, shown above, and you have already learned how the wizard creates a simple line series in other visualizations. You could then modify the nD math of the generated box plot to add a line series set to the mean price.
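The nD math for this change is not reproduced here, so the following is only a plain-Python sketch of the numbers involved: the quartiles a box plot draws, plus the mean that the extra line series would sit at. The function name and the prices are hypothetical, and the whiskers are simplified to the minimum and maximum rather than any outlier rule.

```python
from statistics import mean, quantiles


def box_plot_summary(values):
    """Compute the values a box plot displays, plus the mean for an overlay line."""
    # method="inclusive" treats the data as the full population of interest.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": mean(values),  # where the added line series would be drawn
    }


# Hypothetical prices, invented for this example.
prices = [100000, 200000, 300000, 400000, 900000]
summary = box_plot_summary(prices)
```

With skewed data like this, the mean line sits above the median line of the box, which is exactly the kind of detail the added series makes visible.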
As powerful as the nD Visualization Wizard is, there will be some analyses and visualizations specific to your type of data that may not initially be included. Maybe your manufacturing process has an industry-specific analysis. Or maybe a specific type of visual is commonly used only for your type of data.
The great thing about the nD platform is that the analyses and wizards can be extended to meet your specific needs. All of the visual elements can be addressed from nD math as well, allowing you to generate powerful visuals unique to your business. And best of all, once created, these custom analyses and visuals can be added to the Visualization Wizard and other wizards to automatically be available for everyone in your organization as part of the Data Discovery process.
The nD visualization tools go well beyond standard chart types, including a full 3D rendering engine with animation and a comprehensive set of widgets to create web applications. And don’t forget that nD’s cloud-based in-memory cluster is powerful enough to extend smart data visualization to datasets of any size and to complex analysis techniques, while staying responsive.
Use Case: Visualizing Massive Power Simulations
For example, one of our customers, Black & Veatch, used nD to run a massive simulation of Hawaii’s electric power system, including all known capital investment options through 2045. These simulations ended up producing more than 425 trillion data points, which they visualized using nD. As one would expect with a simulation this massive, new visualization techniques needed to be developed. The following is a good example of a standard visual that their data scientists developed as part of this work and added to their smart visualization arsenal for everyone on their team to reuse.
The Power of Artificial Intelligence
Sometimes even more advanced types of analysis are necessary to extract the value in your data. Machine Learning gives you the ability to model the processes behind your data, often providing the deepest insights. With the latest advances in machine learning, you can detect anomalies in your data, cluster it into similar groups, classify new data based on previous observations, simulate causes and effects, predict outcomes, create recommendations and even optimize your processes for the best performance. With nD, you get the power of machine learning plus smart visualization in one environment.
In the next article in our series, we will talk about how everyone in your organization can take advantage of Modeling & Machine Learning in nD. And then we will wrap up this blog series with Collaboration & Deployment in the Cloud. All while still focusing on the value in your data and all within nD.