For a recent experiment, we monitored temperature and pressure. The experiment was controlled from a C# program connected to a Measurement Computing multifunction card. Because the application was written in C#, the CloudTurbine (CT) Java API was not directly accessible, so I had the application create the output folder structure for the packed data files itself, and then viewed the data using WebScan. The figure below shows the two file structures I used: first the malformed Case A (which “works” but didn’t give the desired result) and then the correct file structure shown in Case B. The figure shows our “temperature” data files. We are using the binary “.f64” packed data format, where each of our files contains 100 double-precision values collected at 1 Hz.
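As a rough illustration of what the application does, here is a minimal Python sketch (the original program was C#) that writes one packed “.f64” file into a Case-B-style relative-time hierarchy. The root folder name, the little-endian byte order, and the helper name are assumptions for illustration; verify the byte order your CT installation expects.

```python
import os
import struct

def write_packed_f64(root, base_time, start_rel, duration_rel, values):
    """Write one packed .f64 file under a relative-time hierarchy:
    <root>/<base_time>/<start_rel>/<duration_rel>/temperature.f64
    (hypothetical helper; byte order is an assumption to verify)."""
    folder = os.path.join(root, str(base_time), str(start_rel), str(duration_rel))
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, "temperature.f64")
    with open(path, "wb") as f:
        # pack all samples as consecutive little-endian doubles
        f.write(struct.pack("<%dd" % len(values), *values))
    return path

# 100 samples at 1 Hz -> relative start 0, duration 99 seconds
path = write_packed_f64("CTdata", 1472049334, 0, 99,
                        [20.0 + 0.01 * i for i in range(100)])
```

Each subsequent block would then be written under its own start-time folder (100, 200, …), as described for Case B below.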
Here are a few things to keep in mind about the CT file structure:
- First, read our introduction to CT File Structure for an excellent overview of the CloudTurbine file structure; it is the standard reference for this article.
- Second, the examples presented here are for packed data files (multiple data points per file) and we use relative timestamps in the folder hierarchy. Absolute timestamps can also be used at all levels of the file hierarchy, in which case the interpretation of start and end times is slightly different than what we present here.
- Third, when using relative timestamps, there are three times to consider:
- The “base” time, which is the absolute timestamp indicated by the top folder in the hierarchy; 1472049334 for the cases shown above (this is an epoch timestamp which corresponds to 8/24/2016 10:35:34 AM).
- The “start time”, or timestamp of the first data point in the packed data file. In a multi-folder relative time hierarchy such as the examples shown above, this is calculated by summing up the times indicated by the folders in the hierarchy starting from the “base” time down through the folder structure but NOT including the relative time indicated by the folder which contains the data file itself. We’ll give examples of how this is calculated below.
- The “end time”, or timestamp of the last data point in the packed data file. In a multi-folder relative time hierarchy such as the examples shown above, this is calculated by summing up the times indicated by the folders in the hierarchy starting from the “base” time down through the folder structure including the relative time indicated by the folder which contains the data file itself. Again, we’ll give examples of how this is calculated below. Another way to look at this: the relative time indicated by the folder which contains the data file (i.e., the bottom folder in the hierarchy) is the duration of the data in the packed file.
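The rules above can be captured in a short Python sketch (Python is used here purely for illustration; `packed_times` is a hypothetical helper, not part of the CT API). The start time sums every relative folder except the bottom one; the end time adds the bottom folder, which is the file’s duration.

```python
def packed_times(path):
    """Derive (base, start, end) from a relative-time folder hierarchy:
    start = base + all relative folders except the last;
    end   = start + the last folder (the packed file's duration)."""
    parts = path.replace("\\", "/").split("/")
    base = int(parts[0])
    rels = [int(p) for p in parts[1:-1]]  # relative folders between base and file
    start = base + sum(rels[:-1])
    end = start + rels[-1]
    return base, start, end

print(packed_times(r"1472049334\0\99\temperature.f64"))
# (1472049334, 1472049334, 1472049433)
```

Note that with only one relative folder (as in Case A below), that folder is both the bottom of the hierarchy and the only relative time, so the start time collapses to the base time.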
With these rules in mind, let’s go over these two examples and examine the output data from WebScan:
Case A: malformed file structure
This was my first attempt at setting up the relative time hierarchy for packed data. I expected the temperature data to look something like that shown in the figure below.
However, what WebScan displayed is shown in the figure below.
What is WebScan displaying? Consider the path to the first packed data file: “1472049334\100\temperature.f64”. CT interprets this as follows: the base time is 1472049334; the start time is also 1472049334; the end time is 1472049334 + 100. Compare this with how CT interprets the path to the second “temperature.f64” file, which has the same base and start times as the first (1472049334) but an end time of 1472049334 + 200. Thus, the data in the first packed file starts at 1472049334 and is spread over 100 seconds, while the data in the second packed file starts at the same time but is spread over 200 seconds; the third file’s data, again starting at 1472049334, is spread over 300 seconds. CT therefore overlaps the data from all consecutive packed files, stretching each one over a wider interval than the last. This results in the screwy-looking WebScan plot shown above.
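Working out those three Case A intervals in a few lines of Python makes the overlap obvious: every interval starts at the base time, and each successive one is wider than the last.

```python
base = 1472049334
intervals = []
for bottom in (100, 200, 300):
    # with a single relative folder, start = base and end = base + bottom folder
    start, end = base, base + bottom
    intervals.append((start, end))
    print("1472049334\\%d\\temperature.f64 -> [%d, %d], span %d s"
          % (bottom, start, end, end - start))
```

Since all three intervals share the same start, WebScan draws each file’s 100 samples over an ever-widening window, piling the later blocks on top of the earlier ones.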
Case B: good file structure
Case B shows the proper relative time folder hierarchy for displaying the packed data. Many thanks to Matt Miller for showing me this corrected folder structure. In order to have CT properly interpret start and end times for the packed data files, an additional folder level needed to be introduced into the hierarchy. Two folder levels (as in Case A) work fine for interpreting packed data start and end times when using absolute/epoch times as folder names. But when using relative times, as in my case, an additional folder level is required. The figure below shows this data in WebScan (both temperature and pressure channels are displayed here).
Here are a couple examples of how CT derives start and end time:
- For the first packed data file at 1472049334\0\99\temperature.f64, the base time is 1472049334, the start time is 1472049334 + 0, the interval or duration for the packed data is 99 and the end time is 1472049334 + 0 + 99.
- For the second packed data file at 1472049334\100\99\temperature.f64, the base time is 1472049334, the start time is 1472049334 + 100, the interval or duration for the packed data is 99 and the end time is 1472049334 + 100 + 99.
For my data, each packed file contains data from a 99 second duration. Each file contains 100 data points taken at 1 second intervals. Thus, for the first data file, in terms of relative time, the data points are at 0 sec, 1 sec, 2 sec, …, 99 sec; this gives a file duration of 99 – 0 = 99 seconds.
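The Case B arithmetic above can be checked in a short Python snippet. The first two (start, duration) pairs come from the examples above; a third block at relative start 200 is an assumed continuation for illustration.

```python
base = 1472049334
# (relative start, duration) pairs; the (200, 99) block is assumed
blocks = [(0, 99), (100, 99), (200, 99)]
spans = [(base + s, base + s + d) for s, d in blocks]
for (s0, e0), (s1, e1) in zip(spans, spans[1:]):
    # at 1 Hz, each block starts exactly 1 s after the previous one ends
    assert s1 == e0 + 1
print(spans)
```

Unlike Case A, the intervals are contiguous and non-overlapping, which is why WebScan renders this hierarchy correctly.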
A final consideration of the folder structure for packed data is that, in general, the folder containing the packed data file cannot have any peer folders with a common parent. Said another way, the folder containing the packed file must itself be an “only child” to its parent folder. For example, consider Figure 3 in the CT File Structure document: when packed data is being stored, the folder at the “Blocks” level should only contain a single child “Points” folder. This makes sense, since it would be odd to have 2 or more packed data files for the same channel with the same start time.
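The “only child” rule lends itself to a simple sanity check. Below is a hypothetical Python checker (not part of the CT API) that walks a source tree and flags any folder holding packed .f64 files that has sibling folders under the same parent; the demo tree deliberately violates the rule with two duration folders under one start-time folder.

```python
import os
import tempfile

def only_child_violations(root):
    """Flag folders that contain packed .f64 files but are NOT their
    parent's only subfolder (hypothetical checker, for illustration)."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        if any(name.endswith(".f64") for name in filenames):
            parent = os.path.dirname(dirpath)
            siblings = [d for d in os.listdir(parent)
                        if os.path.isdir(os.path.join(parent, d))]
            if len(siblings) > 1:
                bad.append(dirpath)
    return bad

# Demo: two duration folders ("99" and "199") under the same start-time
# folder -- both are flagged, since neither is an only child
root = tempfile.mkdtemp()
for dur in ("99", "199"):
    d = os.path.join(root, "1472049334", "0", dur)
    os.makedirs(d)
    open(os.path.join(d, "temperature.f64"), "wb").close()
violations = only_child_violations(root)
```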
Note that when using the CloudTurbine Java API, you are unlikely to make the same mistake I did, as the API implements a well-formed folder hierarchy for you. However, hopefully reading through this description provides some insight into the data folder structure and how it is used to assign timestamps to packed data.